Bringing a Large-Scale Machine Learning Service to the Desktop

Wouldn't it be exciting to type classify(dataFileLoc) or regress(dataFileLoc) on the command line and spin up hundreds of cloud nodes that read the data from the location given by dataFileLoc, run machine learning on it, and return the results? We have built a large-scale, cloud-based machine learning system. This paper describes the latest version of our system. The system is a collection of distributed compute units, and its design lets it shrink and expand elastically as compute units are added or removed. This project addresses two issues: first, developing interfaces that monitor learning progress and system status as the computation runs; second, providing simple desktop-based access to the system.

Progress monitoring becomes especially challenging when the algorithm runs on 300-1000 cores. This project aims to develop a variety of distributed techniques that aggregate progress information from different nodes, and to create efficient, elastic visual interfaces (think of zooming in or out of a model of the system and, at each zoom level, obtaining appropriately abstracted information). These interfaces would let the user see the progress of the computation and the system configuration in order to decide which nodes to add or remove. The desktop-based access lets the user issue machine learning commands such as classify(data) or regress(data) and spin up a large-scale system like this one. The project team consists of graduate students and a postdoc. You will work with a team that has a lot of experience with this system, which makes it all the more fun to help the team visualize what they are building.
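
To give a flavor of the zoom-level idea, here is a minimal single-process sketch in Python. It is not the system's actual code: the NodeReport fields (cluster, rack, iterations, loss), the four zoom levels, and the summarize function are all hypothetical names chosen for illustration, and a real version would aggregate across nodes in a distributed way rather than in one loop.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NodeReport:
    """One progress report from a compute node (all fields are hypothetical)."""
    node_id: str
    rack: str
    cluster: str
    iterations: int   # learning iterations completed so far
    loss: float       # current training loss on this node


# Each zoom level maps a report to the group it belongs to at that level.
ZOOM_KEYS = {
    "system": lambda r: ("all",),
    "cluster": lambda r: (r.cluster,),
    "rack": lambda r: (r.cluster, r.rack),
    "node": lambda r: (r.cluster, r.rack, r.node_id),
}


def summarize(reports, zoom):
    """Group reports at the requested zoom level and abstract each group."""
    key_fn = ZOOM_KEYS[zoom]
    groups = defaultdict(list)
    for report in reports:
        groups[key_fn(report)].append(report)

    summary = {}
    for key, members in groups.items():
        summary[key] = {
            "nodes": len(members),
            # The slowest node bounds the progress of the whole group.
            "min_iterations": min(m.iterations for m in members),
            "mean_loss": sum(m.loss for m in members) / len(members),
        }
    return summary


if __name__ == "__main__":
    reports = [
        NodeReport("n1", "r1", "c1", 10, 0.5),
        NodeReport("n2", "r1", "c1", 20, 0.3),
        NodeReport("n3", "r2", "c1", 5, 0.4),
    ]
    for level in ("system", "rack"):
        print(level, summarize(reports, level))
```

Zooming out ("system") collapses every node into one summary, while zooming in ("rack" or "node") exposes which group is lagging, which is the kind of information a user would need before deciding to add or remove nodes.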

Eligibility: MEng students, and juniors and seniors looking to lead into an MEng via 6.UAT, UAP
Background: Course 6 software courses and machine learning knowledge (6.034 and 6.867)
Please contact: or