Enterprise Data Integration
A typical enterprise has 5000 on-line databases, all designed independently, and it needs to integrate the information in them for business purposes. This data is inconsistent and incomplete, uses differing terminology, and is sometimes just plain wrong. In spite of these terminology and data quality issues, the integration must be accurate, perhaps with a human assist. The same integration challenge arises in biological and chemistry databases and in the health care arena.
The goal of this project is to build a data integration framework that can operate at scale, using a mix of machine learning and user input. If you are interested, contact Mike Stonebraker (firstname.lastname@example.org). A follow-on MEng project is quite possible.