Feature Decision Boundaries and Quantization for Big Data Classification with ML

When building a rule-based classifier (also known as a decision list) that is meant to be human-readable, the choice of decision boundaries has a significant effect on the accuracy of the resulting solutions. The goal of this project is to develop efficient methods and algorithms for identifying decision boundaries over large feature sets. We are working on a large-scale classification problem in the medical domain with possibly hundreds or thousands of variables, some of which are tightly correlated. Exhaustively searching for the best thresholds for decision boundaries is intractable at this scale, so efficient methods are needed. You will work with a team of researchers with strong experience in this area. This is an exciting project in which you will learn about interpretable machine learning and the implications of big data for some traditional algorithms, and develop methods that could have significant impact in the medical domain. Your project will be closely integrated into a team effort; the team consists of a post-doc and graduate students.
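
To make the idea of a threshold for a decision boundary concrete, here is a minimal sketch of one traditional way to pick a single cut point for one numeric feature: sort the values, try the midpoint between each pair of neighboring distinct values, and keep the one with the highest information gain. This is an illustration only, not the project's method. The function names `entropy` and `best_threshold`, the binary 0/1 labels, and the information-gain criterion are all assumptions made for the example.

```python
from math import log2


def entropy(pos, neg):
    """Binary entropy (in bits) of a set with `pos` positive and `neg` negative labels."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def best_threshold(values, labels):
    """Return (threshold, information_gain) for one numeric feature.

    Hypothetical helper: labels are assumed to be 0 or 1, and the rule being
    scored is `value <= threshold`. Returns (None, 0.0) if no split helps.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(label for _, label in pairs)
    total_neg = n - total_pos
    base = entropy(total_pos, total_neg)

    best = (None, 0.0)
    left_pos = left_neg = 0
    for i in range(n - 1):
        value, label = pairs[i]
        left_pos += label
        left_neg += 1 - label
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # no boundary can separate identical values
        left_n = i + 1
        right_n = n - left_n
        # Weighted entropy of the two sides after splitting at this boundary.
        remainder = (left_n / n) * entropy(left_pos, left_neg) + (
            right_n / n
        ) * entropy(total_pos - left_pos, total_neg - left_neg)
        gain = base - remainder
        if gain > best[1]:
            best = ((value + next_value) / 2, gain)
    return best


# Toy data: low readings are labeled 0, high readings are labeled 1.
threshold, gain = best_threshold([1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
                                 [0, 0, 0, 1, 1, 1])
print(threshold, gain)
```

Each feature costs one sort plus a linear scan in this sketch. Repeating that for every variable, and then for combinations of thresholds across correlated variables, is where the cost grows with large feature sets.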

Open to: MEng students, and juniors and seniors looking to continue to an MEng via 6.UAT or UAP
Background: Course 6 software courses and machine learning knowledge (6.034 and 6.867)
Please contact: hembergerik@csail.mit.edu, kalyan@csail.mit.edu or unamay@csail.mit.edu