UROP Research Opportunities

The Undergraduate Research Opportunities Program (UROP) cultivates and supports research partnerships between MIT undergraduates and faculty. If you have any questions, please contact tluongo@csail.mit.edu or take a look at the How to UROP at CSAIL document (PDF format).

This program is available to MIT students only.


  • Interactive Data Visualization for Everyone on the Web

    Faculty Advisor: David Karger
    Contact e-mail: karger@mit.edu
    Research Area(s): Graphics and Human-Computer Interfaces
    Exhibit is an open-source JavaScript library that helps non-programmers author and publish rich interactive data visualizations on the web. We use Exhibit to push the boundaries of web authoring without programming, with the ultimate goal of enabling end users to author complete web applications in a WYSIWYG fashion. Exhibit has been adopted on over a thousand web sites by hobbyists, scientists, merchants, and journalists...

    Posted date: April 04, 2013
  • Interactive Data Visualization for Journalists using WordPress

    Faculty Advisor: David Karger
    Contact e-mail: karger@mit.edu
    Research Area(s): Graphics and Human-Computer Interfaces
    There's a new movement in journalism to incorporate rich data visualization in news stories, but many journalists lack the skills to create their own "news apps" for this purpose. We've prototyped a data visualization framework, Datapress, to support authoring (not programming) such...

    Posted date: April 04, 2013
  • The Future Textbook

    Faculty Advisor: David Karger
    Contact e-mail: karger@mit.edu
    Research Area(s): Graphics and Human-Computer Interfaces
    Now that we can put textbooks on the web, how can we change them to make them better? How can we make them more dynamic, more adaptable to individual students, more sociable, or more informative? We've tackled some of these questions with Nb, a tool that lets students hold forum-type discussions in the margins of their...

    Posted date: April 04, 2013
  • Transparent Web Browsing

    Faculty Advisor: David Karger
    Contact e-mail: karger@mit.edu
    Research Area(s): Graphics and Human-Computer Interfaces
    Nowadays, all sorts of shady companies are collecting information about your browsing activities and using it for their own mysterious purposes. How could that information be used to your benefit? We propose to build Eyebrowse, a web browser extension that gathers information about your web browsing activities and shares that information (under your control) with...

    Posted date: April 04, 2013
  • Grammatical structure in developmental protein/DNA binding

    Cells choose their identity as a result of combinatorial expression of proteins called transcription factors that bind to specific DNA sequences and turn on and off sets of genes. Our understanding of this cellular programming is rudimentary, but a more complete characterization could enable the conversion of one cell type into another with transformative therapeutic consequences. We have devised a machine learning technique that identifies the genomic binding location of a large number of transcription factors in a given cell state based on an experimental dataset called DNase-Seq, and we...

    Posted date: April 02, 2013
  • Mechanisms of master regulator hand-off during red blood cell development

    So-called master regulators are transcription factor proteins whose individual expression can effect a change in cell identity by either directly co-binding with other factors to target specific gene regulation sites, or through the activation of broad signaling pathways. However, what precipitates the transition from one master-regulated state to another is typically not as well understood. For example, during the developmental transition from hematopoietic stem cells to red blood cells the Gata2 master regulator gives way to Gata1, binding different genomic sites despite their...

    Posted date: April 02, 2013
  • Investigating Natural Language Tools for Artificial Intelligence

    The Infolab seeks UROPs interested in investigating natural language tools for artificial intelligence. The Infolab works on question answering, parsing, generating, and more, using both symbolic and statistical techniques. Introductory projects range from integrating knowledge sources to expanding automated methods to creating user interfaces and APIs; continuing opportunities for more in-depth research are available.

    Contact: Boris Katz, boris@csail.mit.edu

    Posted date: March 28, 2013
  • Segmentation of organs at risk in Head and Neck CT scans

    Do you want to contribute to improving the lives of patients with head and neck cancer? Help us develop a better algorithm for the segmentation of organs at risk. Accurate segmentation makes it possible to design radiotherapy treatment plans that expose organs at risk to a low radiation dose, leading to improved quality of life after treatment. The segmentation is performed on 3D computed tomography (CT) images. We apply machine learning techniques to assign labels to patches based on a repository of manually labeled images. Implementation is mainly done in MATLAB. Our main goal is to refine...

    Posted date: March 27, 2013
  • iDiary

    Imagine an automatic private diary that records your life. For example, it allows you to:

    - Manage your time and get statistics about the time you spent with specific friends or family, or at specific places.

    - Search it for all the restaurants that you visited last year and send the list to a guest.

    - See where you celebrated every birthday of your life.

    - Publish parts of your auto autobiography to the world, and to your grandchildren in the future.

    Our group at DRL is developing solutions towards these goals based on data collected from smartphone sensors. We build...

    Posted date: March 26, 2013
  • Printable Robots

    The goal of this project is to build a variety of flexible robotic systems from scratch using planar fabrication techniques. These systems include a number of origami-inspired foldable robots and pneumatically actuated elastomeric soft robots. Made of flat plastic sheets, these robots carry their own custom flexible circuit boards. Our aim is to achieve general, easy, and simple techniques for printing functional machines, and to demonstrate that a suite of devices can be created and programmed this way. Another goal is to enable wireless programming, communication, and remote control...

    Posted date: March 26, 2013
  • A development environment for mobile apps, education, and entrepreneurship

    The goal of this project is to build and test an integrated development environment where undergraduates can generate ideas for mobile applications, build prototypes, and refine these to the point where they could be the basis for launching new ventures. Students' initial design work will be done using App Inventor for Android, which enables rapid investigation of working prototype apps.

    One challenge in this project is to create extension mechanisms for App Inventor so that students can smoothly bridge from their initial prototyping work to more refined use of the Android SDK....

    Posted date: March 26, 2013
  • Project: To program a model of a fly's visual tracking

    We are looking for a UROP to program the simulated behavior of several artificial flies interacting visually with each other. Each fly is described by a simple tracking system (Buelthoff, Poggio and Wehrhahn 1980; Wehrhahn, Poggio and Buelthoff 1982) which summarizes behavioral experiments in which individual real flies track and chase targets. The model for this behavior is suggested by M. Poggio and T. Poggio in their paper "Cooperative physics of fly swarms: an emergent behavior," A.I. Memo No. 1512, C.B.C.L. Paper No. 103 (1994). We expect the model to be programmed and hopefully also...

    Posted date: March 21, 2013
  • Video Magnification

    We are developing algorithms to manipulate temporal variations in videos, both to reveal small, imperceptible changes and to automatically remove distracting ones.

    We are looking for a strong and motivated student to work closely with us on exploring potential applications. This includes trying out our methods with different kinds of data, such as medical images (fMRI), satellite imagery, seismic data and time-lapse videos, developing tools to facilitate the experiments, and potentially helping us tune and improve the...

    Posted date: March 14, 2013
  • Bringing a Large-Scale Machine Learning Service to the Desktop

    Wouldn't it be exciting to be able to call classify(dataFileLoc) or regress(dataFileLoc) on the command line and spin up hundreds of nodes or more in the cloud that access the data at the location specified by dataFileLoc, run machine learning, and return results? We have built a large-scale, cloud-based machine learning system. This paper explains the latest version of our system. Our system is a collection of distributed compute units and its design...

    Posted date: February 12, 2013
  • Evolutionary Design and Optimization Group: Student Research Opportunities for Spring/Summer 2013

    Posted date: February 12, 2013
  • Mining a MOOC's activity data: 6.002X explored

    We are building a variety of machine learning algorithms for mining data generated while delivering educational content to hundreds and thousands of students all over the world. A fundamental question that people in education are attempting to answer is: "What worked?" Answering this question requires us to analyze data in novel ways, for example by building models of students and balancing for confounding factors. We are looking for a talented UROP or MEng student to work with a Research Scientist and a group of scientists and fellows at the MIT EdX team. This project has possible...

    Posted date: February 12, 2013
  • Big Data + Machine Learning + Medicine + Volunteer Compute: Could it get any more exciting?

    Come join us and learn how we are building a large-scale machine learning system through which we are attempting to solve some of the most challenging problems for our society. The most fascinating part of this is that we want to do it using the leftover CPU cycles on machines all across the world. Technically, this creates a challenge for us, as we are not able to centrally coordinate and plan data distribution and algorithmic steps, or to collect and process results. During the first two years we have made a lot of progress and are now seeking students to work with us in deploying...

    Posted date: February 12, 2013
  • BP-Watch: Predicting blood pressure in an ICU setting

    We are building a large scale predictive system that predicts the blood pressure for a patient under intensive care. The project relies on cloud-scale machine learning of many diverse predictive models. A variety of tasks are on the agenda including cloud-scale empirical experimentation, cross-referencing model predictions to clinical events, time series modeling, unsupervised learning of similar blood pressure segments and ultimately transforming many model outputs which are in the form of probabilities and predictions into visualizations that are succinct and informative to the doctors...

    Posted date: February 12, 2013
  • Feature Decision Boundaries and Quantization for Big Data Classification with ML

    When building a rule-based classifier (aka decision list) that allows readability, the decision boundaries have a significant effect on the accuracy of the solutions. The goal of this project is to develop efficient methods and algorithms to identify decision boundaries for large feature sets. We are working with a large-scale classification problem in the medical domain with possibly hundreds and thousands of variables, some of which are tightly correlated. At this scale, identifying thresholds for decision boundaries is intractable without efficient methods. You will work with a team of researchers with strong...

    Posted date: February 12, 2013
  • FlexGP: Evaluating a Million models on a Billion cases

    Our FlexGP system currently generates thousands of non-linear models of the form y=f(x), where f(.) could be any mathematical function generated from a set of operators such as log, sin, and sqrt. For example, an expression could be y=log(x1)+sin(x2). For big data problems we have to perform multiple passes through the data, each time applying the model to the data and measuring its accuracy, to identify the set of non-linear expressions that best explain the data. In this regard we are investigating and developing methods in order to be able to evaluate a million models on a billion...

    Posted date: February 12, 2013
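
To make the FlexGP description above concrete, here is a minimal Python sketch of scoring several expressions of the form y=f(x) in a single pass over the data. It is not the FlexGP implementation: the names MODELS, mean_squared_errors, and best_model are hypothetical, the second expression and the toy data are invented for illustration, and mean squared error is an assumed accuracy measure that the posting does not specify.

```python
import math

# Hypothetical candidate models of the form y = f(x1, x2).
# The first matches the example expression in the posting; the second is made up.
MODELS = {
    "log(x1)+sin(x2)": lambda x1, x2: math.log(x1) + math.sin(x2),
    "sqrt(x1)+x2": lambda x1, x2: math.sqrt(x1) + x2,
}


def mean_squared_errors(models, rows):
    """Score every model in one pass over rows.

    rows is any iterable of (x1, x2, y) tuples, so it can be streamed
    from disk instead of being held in memory.
    """
    squared_error_sums = {name: 0.0 for name in models}
    count = 0
    for x1, x2, y in rows:
        count += 1
        for name, f in models.items():
            err = f(x1, x2) - y
            squared_error_sums[name] += err * err
    if count == 0:
        raise ValueError("no data rows")
    return {name: total / count for name, total in squared_error_sums.items()}


def best_model(models, rows):
    """Return the name of the model with the lowest mean squared error."""
    errors = mean_squared_errors(models, rows)
    return min(errors, key=errors.get)


# Toy data generated from the first expression (an assumption for this sketch).
data = [
    (x1, x2, math.log(x1) + math.sin(x2))
    for x1 in (1.0, 2.0, 3.0)
    for x2 in (0.0, 0.5, 1.0)
]

print(best_model(MODELS, data))
```

On this toy data the first expression reproduces the targets exactly, so it is the one selected. Scoring all models inside a single loop over the rows means the data is read once per pass regardless of how many models are being compared, which is one plausible way to keep the cost of a pass tied to data size.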