As part of Data Civilizer, we are designing abstractions and building tools and systems to help people with their data-related tasks, from discovering data to cleaning and transforming it. The aim is to shape the data so that it is easy to analyze, for example to fit a model or fill in a report.
We are building a database for autonomous vehicle sensor data that addresses the challenges presented by both the potential scale of that data and its unique characteristics.
The goal of the project is to make database management systems resilient to workload variations (e.g., load spikes due to news events) by enabling them to automatically expand and contract the size of the database cluster and balance load across servers.
Data scientists universally report that they spend at least 80% of their time finding data sets of interest, accessing them, cleaning them, and assembling them into a unified whole.
This project shows that we can successfully use predictive modeling to enable a database cluster to elastically expand or contract in anticipation of changes in the workload.
The Systems CoR is focused on building and investigating large-scale software systems that power modern computers, phones, data centers, and networks, including operating systems, the Internet, wireless networks, databases, and other software infrastructure.