Scalable methods for fusing multiple models generated for big data

When dealing with big data, we generate thousands of models, each specializing on a subset of the data. We are developing techniques that combine these models by learning weights for fusing their predictions. The techniques range from simple averaging to weighted sums to probabilistic approaches. Known as ensemble learning, these methods allow users to reach prediction accuracies higher than any single model achieves on its own.
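To make the fusion idea concrete, here is a minimal sketch (Python with NumPy only; the synthetic predictions, shapes, and least-squares weight fit are illustrative assumptions, not our actual algorithms) contrasting simple averaging with a weighted sum whose weights are learned on a held-out set:

```python
# Minimal sketch of model fusion: simple average vs. learned weighted sum.
# The base-model predictions here are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(0)

# Pretend we have 3 base models, each predicting on 100 held-out examples.
n_models, n_examples = 3, 100
y_true = rng.normal(size=n_examples)                      # held-out targets
noise_scales = np.array([[0.5], [1.0], [2.0]])            # model 0 is best, model 2 worst
preds = y_true + rng.normal(scale=noise_scales, size=(n_models, n_examples))

# 1) Simple average: every model gets the same weight.
avg_fused = preds.mean(axis=0)

# 2) Weighted sum: learn one weight per model by least squares on the held-out set.
X = preds.T                                               # columns = base-model predictions
weights, *_ = np.linalg.lstsq(X, y_true, rcond=None)
weighted_fused = X @ weights

def mse(pred):
    return float(np.mean((pred - y_true) ** 2))

print("simple-average MSE :", mse(avg_fused))
print("learned-weight MSE :", mse(weighted_fused))
print("learned weights    :", weights)                    # noisier models get smaller weights
```

In this toy setting the learned weights down-weight the noisier models, which is the basic effect we want to scale to thousands of models.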


We are, however, interested in applying ensemble learning to large datasets. We are seeking a student who will work closely with a graduate student and a post-doc to scale our weight-learning algorithms and test our methodology on multiple big datasets, possibly including the Heritage Health Prize data.

Open to: MEng students, and juniors and seniors looking for a project that can lead to an MEng via 6.UAT or UAP
Background: Course 6 classes in software and machine learning (6.034 and 6.867)
Please contact: kalyan@csail.mit.edu or unamay@csail.mit.edu