Data scientists universally report that they spend at least 80% of their time finding data sets of interest, accessing them, cleaning them and assembling them into a unified whole.

Data Civilizer is an end-to-end project to reduce that 80%. It consists of sub-projects on data discovery (Aurum), view construction, data cleaning, data transformation and golden record construction. A complete prototype is close to being operational.