UROP Research Opportunities

The Undergraduate Research Opportunities Program (UROP) cultivates and supports research partnerships between MIT undergraduates and faculty. If you have any questions, please contact tluongo@csail.mit.edu or see the How to UROP at CSAIL document (PDF).

This program is available to MIT students only.


  • FlexGP: Evaluating a Million models on a Billion cases

    Our FlexGP system currently generates thousands of non-linear models of the form y=f(x), where f(.) can be any mathematical function built from a set of operators such as log, sin, and sqrt. For example, an expression could be y=log(x1)+sin(x2). For big data problems we have to make multiple passes through the data, each time applying the models to the data and measuring their accuracy, to identify the set of non-linear expressions that best explains the data. To this end, we are investigating and developing methods to evaluate a million models on a billion...

    Posted date: February 12, 2013
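
    The evaluation loop described in the FlexGP entry above can be sketched as follows. This is only a minimal illustration, not the FlexGP implementation: the candidate models, the toy data, and the helper names (`mean_squared_error`, `rank_models`) are all assumptions made for the example, and mean squared error stands in for whatever accuracy measure the project uses.

```python
import math

# Hypothetical candidate models of the form y = f(x); each takes a feature tuple.
MODELS = {
    "log(x1)+sin(x2)": lambda x: math.log(x[0]) + math.sin(x[1]),
    "sqrt(x1)": lambda x: math.sqrt(x[0]),
}


def mean_squared_error(model, rows):
    """One pass over the data: apply the model to every row and average the squared error."""
    total = 0.0
    count = 0
    for features, target in rows:
        prediction = model(features)
        total += (prediction - target) ** 2
        count += 1
    return total / count


def rank_models(models, rows):
    """Return (name, error) pairs sorted from most to least accurate."""
    scores = [(name, mean_squared_error(f, rows)) for name, f in models.items()]
    return sorted(scores, key=lambda pair: pair[1])


# Toy data generated from y = log(x1) + sin(x2) (made up for illustration).
DATA = [((x1, x2), math.log(x1) + math.sin(x2))
        for x1 in (1.0, 2.0, 3.0) for x2 in (0.0, 0.5, 1.0)]

ranking = rank_models(MODELS, DATA)
```

    Every model needs a full pass over the data, so the cost grows as (number of models) x (number of cases), which is why the million-by-billion scale is the hard part.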
  • Machine Learning and Big Data: Performance Analytics

    Machine learning systems depend on parameters and sometimes, buried deep inside, some randomized initial state. So, when we run them on big data with different parameterizations, how can we unify the results? Also, how can we interpret an ML system's rules, classifiers, or models to learn how to iterate with an updated question for the ML system so we get better accuracy? Is the algorithm having trouble predicting a certain class? Why? Is it because of class imbalance or inadequate discriminatory power of a feature? Should we adjust the objective function to address these issues? Are the...

    Posted date: February 12, 2013
  • Predicting "Rare" events in an ICU

    We are developing a system that predicts rare events, such as hypotensive episodes, in an ICU setting. We have assembled a large arterial blood pressure feature-level dataset from a publicly available waveform dataset. One of the challenges is that the balance of the classes in the data is extremely skewed due to the rare nature of the events we are interested in. This imbalance in the data can significantly impact the accuracy of the forecast and it especially affects the dynamics of our iterative learning engine. The goal of the project is to develop and identify an efficient...

    Posted date: February 12, 2013
  • Scalable methods for fusing Multiple models generated for big data

    When dealing with big data we generate thousands of models, each of which specializes in a subset of the data. We are developing techniques that combine these models by learning weights for fusing their predictions. The techniques range from a simple average to a weighted sum to probabilistic approaches. Known as ensemble learning, these methods have allowed users to reach prediction accuracies higher than any single model achieves.

    See for example...

    Posted date: February 12, 2013
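
    The two simplest fusion schemes named in the entry above, the simple average and the weighted sum, can be sketched as follows. The weighting rule here (weights proportional to the inverse of each model's held-out error) is an assumed heuristic chosen for brevity, not the project's method, and all numbers are made up.

```python
def simple_average(predictions):
    """Fuse per-model predictions for one case by averaging them."""
    return sum(predictions) / len(predictions)


def inverse_error_weights(model_errors, eps=1e-12):
    """Weights proportional to 1/error, normalized to sum to 1 (assumed heuristic)."""
    inverse = [1.0 / (e + eps) for e in model_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_sum(predictions, weights):
    """Fuse per-model predictions for one case with the given weights."""
    return sum(w * p for w, p in zip(weights, predictions))


# Three hypothetical models predict the same case, whose true value is 10.0.
predictions = [9.0, 10.5, 14.0]
validation_errors = [1.0, 0.25, 16.0]  # made-up squared errors on held-out data

weights = inverse_error_weights(validation_errors)
fused_average = simple_average(predictions)
fused_weighted = weighted_sum(predictions, weights)
```

    In this toy case the weighted sum lands closer to the true value than the plain average because the least accurate model is down-weighted.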
  • Bibliometrics using Machine Learning and Natural Language Processing

    Reviewing past and current literature is a key scientific and engineering activity.  Quantitative analysis of the language, topics, keywords, or citation graphs of any given subset of scientific literature (bibliometrics) can be a great help to understanding what has been done in a field and what the important next steps are. Unfortunately, off-the-shelf support for automatic extraction of citation graphs and analysis of those graphs using natural language processing and machine learning is still relatively limited.  This UROP project will aim to advance the state of the art of...

    Posted date: February 04, 2013
  • Understanding the Human Learning Process Using AI Techniques


    We are looking for strong Python programmers interested in contributing to a core learning architecture, which is going to set the standard internationally for reinforcement learning research. In the process you will learn how to formulate many sequential decision-making problems, such as balancing an inverted pendulum, as Markov Decision Processes (MDPs), and how to solve such problems.
    The following skills are required:
    - Object Oriented Programming in Python
    - Linear Algebra

    The following skills are highly desired:

    Posted date: December 19, 2012
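
    To make the idea of formulating and solving an MDP from the entry above concrete, here is a minimal sketch. It does not model the inverted pendulum or the project's architecture; it uses an assumed toy problem (four states on a line with a goal at one end) and solves it with value iteration, one standard method among several.

```python
# Toy deterministic MDP (an assumption for illustration):
# states 0..3 on a line, state 3 is the terminal goal.
STATES = [0, 1, 2, 3]
ACTIONS = {"left": -1, "right": +1}
GOAL = 3
GAMMA = 0.9  # discount factor


def step(state, action):
    """Return (next_state, reward); reaching the goal pays 1, everything else 0."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value_iteration(tolerance=1e-9):
    """Repeat Bellman optimality backups until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == GOAL:
                continue  # the terminal state keeps value 0
            best = max(r + GAMMA * values[s2]
                       for s2, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tolerance:
            return values


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the best one-step lookahead."""
    policy = {}
    for s in STATES:
        if s == GOAL:
            continue
        policy[s] = max(
            ACTIONS,
            key=lambda a: step(s, a)[1] + GAMMA * values[step(s, a)[0]],
        )
    return policy


values = value_iteration()
policy = greedy_policy(values)
```

    The states, actions, transition function, reward, and discount factor are the pieces any MDP formulation has to specify; a pendulum task would replace the toy `step` function with the system dynamics.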
  • Policies for personal information on the Web

    At CSAIL’s Decentralized Information Group, we think about information on the Web: Where it comes from, what happens to it, and what are the rules for using it. We’ve all seen stories about what people can learn about you from social networks, and the good and bad consequences of that. How can we promote good impacts of that, while minimizing risks and harmful effects? At DIG, we take the perspective that data on the Web should travel together with additional information that says where it comes from (provenance and context) and how it should be used (policy). We build to help...

    Posted date: December 11, 2012
  • Learning computing by building mobile apps

    In December 2010, Google introduced App Inventor for Android, a visual programming environment that makes it easy to create apps for Android phones. Prof. Abelson, who worked on App Inventor during his sabbatical last year, is planning to include the system in a new course this fall. He’s looking for help in developing good examples and creating extensions to the system, maybe even teaching summer workshops for kids. To work on this project, you should have some experience with Python (6.00 or 6.01) and an interest in mobile apps and educational technology. For more information see...

    Posted date: December 11, 2012
  • Authoring of Online Video Lectures

    We are developing new authoring software for video lectures in the “virtual white board” style popularized by the Khan Academy. Many UROP topics are available to help with this endeavor.

    Contact: Fredo Durand fredo@mit.edu

    Posted date: December 04, 2012
  • Visualizations for the Online Python Tutor

    The goal of this UROP is to augment the Online Python Tutor http://pythontutor.com/ with visualizations of the flow and variable changes of a program over time. This is partially inspired by Bret Victor’s Learnable Programming essay http://worrydream.com/LearnableProgramming/ .

    Contact: Fredo Durand fredo@mit.edu

    Posted date: December 04, 2012
  • Knit: Integrating Human Based Partial Analyses of Big Data

    We are developing means to automatically assist analysts and experts in identifying patterns and detecting anomalies in big data streams, as they arise when heterogeneous, unstructured data sources are consulted. Our approach relies solely on analysts and their ability to group or categorize "common situations" into patterns. As one can imagine, an analyst can only process a subset of the big data. We are developing machine learning algorithms that will use these partial groupings for a subset of the big data from each analyst and knit together, i.e....

    Posted date: December 04, 2012
  • Cloud Computing for Mathematics, Science, and Engineering – The Julia Project

    What is Julia? See the blog post http://julialang.org/blog/2012/02/why-we-created-julia/ , which explains why we created Julia. In short, because we are greedy. We are power Matlab users. Some of us are Lisp hackers. Some are Pythonistas, others Rubyists, still others Perl hackers. There are those of us who used Mathematica before we could grow facial hair. There are those who still can’t grow facial hair. We’ve generated more R plots than any sane...

    Posted date: August 07, 2012
  • Developing a Reactive Model-based Programming Language

    The MIT Model-based Embedded & Robotic Systems (MERS) group is performing research into controlling autonomous systems using high-level, model-based, and task-directed languages. To that end, we are developing the Reactive Model-based Programming Language (RMPL). RMPL unifies autonomous plant description, control programs, temporal constraints, and more into a single language. We are looking for UROP students passionate about making programming autonomous systems easier and more accessible. The specific tasks for a UROP depend on the student's interests. Possibilities include: adding...

    Posted date: August 07, 2012
  • Educational Computing and Mobile Apps

    We are the MIT Center for Mobile Learning, and we have several
    UROP slots for work in educational computing and mobility. The current
    opportunities involve App Inventor, a programming tool that lets
    anyone create their own apps for Android smartphones. We run
    a major public cloud-based IDE that lets people all over the world
    create their own mobile applications and lets schools all over the
    world include mobile app programming in their middle and high school

    UROP projects can range anywhere from developing new curriculum and...

    Posted date: July 06, 2012
  • Computational Glass Blowing

    We are currently developing a C++ software system for experimentation with new designs in glass blowing, to explore the design space before actually making something. Our current focus is to develop a system for glass "cane", and to develop new artistic designs of cane, which has been a relatively stagnant field for several years. The project currently involves computer graphics (OpenGL, meshing), geometry (for glass-blowing transformations), Qt user interface design (a kind of visual programming language for glassblowing), and multithreading (for background...

    Posted date: June 25, 2012
  • Automatic evaluation of image processing algorithms

    We want to automatically evaluate the correctness of image processing code in the context of online courses. Students will submit Python code, which we will run on new input images. The main challenge is that multiple answers might be "correct" from the perspective of the course, so we need to develop comparison tools that take this set of possibilities into account.

    Contact: Fredo Durand fredo@csail.mit.edu

    Posted date: April 23, 2012
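
    One simple way to accept several "correct" answers, as the entry above requires, is to compare a submission's output against a set of reference outputs and accept it if it is close enough to any one of them. The sketch below is an assumption-laden illustration, not the project's comparison tool: images are non-empty nested lists of grayscale values, and the function names and the tolerance are hypothetical.

```python
def max_abs_difference(image_a, image_b):
    """Largest per-pixel difference between two grayscale images (lists of rows).

    Returns infinity when the shapes differ, so mismatched sizes never match.
    """
    if len(image_a) != len(image_b):
        return float("inf")
    if any(len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)):
        return float("inf")
    return max(abs(a - b)
               for row_a, row_b in zip(image_a, image_b)
               for a, b in zip(row_a, row_b))


def is_accepted(submission, references, tolerance=1.0):
    """Accept the submission if it is within tolerance of any reference output."""
    return any(max_abs_difference(submission, ref) <= tolerance
               for ref in references)


# Two made-up reference outputs, e.g. produced with different rounding conventions.
REFERENCES = [
    [[10, 20], [30, 40]],
    [[11, 21], [31, 41]],
]

close_enough = is_accepted([[10, 20], [30, 41]], REFERENCES)
too_different = is_accepted([[10, 20], [30, 50]], REFERENCES)
wrong_size = is_accepted([[10, 20, 30]], REFERENCES)
```

    A per-pixel maximum is a strict criterion; a course might prefer a mean or perceptual difference, which would only change `max_abs_difference`.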
  • Probabilistic techniques for program generation

    Program optimization with linkage estimation (POLE) employs a Bayesian network as the probabilistic model and is based on a prototype tree. A prototype tree is a single complete n-ary tree (where n is the maximum arity of the non-terminals) that represents a program. Each node in the tree is mapped to a random variable with its support corresponding to the discrete values for the choice set it can take. Programs are formed by sampling from this probabilistic model. More efficient programs are then used to re-estimate the distribution and then re-sample. Each iteration of this...

    Posted date: April 19, 2012
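
    The sample, select, and re-estimate loop over a prototype tree described in the entry above can be sketched as follows. This is a deliberately simplified stand-in, not POLE: it keeps one independent distribution per tree node instead of learning a Bayesian network, and the symbol set, tree depth, error measure, and all function names are assumptions made for the example.

```python
import random

FUNCTIONS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
TERMINALS = ["x", "1"]
SYMBOLS = list(FUNCTIONS) + TERMINALS
# Prototype tree of depth 2 with arity n = 2:
# index 0 is the root, indices 1 and 2 are its children.


def uniform_model():
    """One independent distribution per node (a simplification of POLE's Bayesian network)."""
    root = {s: 1.0 / len(SYMBOLS) for s in SYMBOLS}
    leaf = {s: 1.0 / len(TERMINALS) for s in TERMINALS}  # leaves hold terminals only
    return [root, dict(leaf), dict(leaf)]


def sample_program(model, rng):
    """Draw one symbol per prototype-tree node."""
    return [rng.choices(list(dist), weights=list(dist.values()))[0] for dist in model]


def evaluate(program, x):
    def terminal(symbol):
        return x if symbol == "x" else 1.0
    root = program[0]
    if root in FUNCTIONS:
        return FUNCTIONS[root](terminal(program[1]), terminal(program[2]))
    return terminal(root)  # children are ignored when the root is a terminal


def error(program, target, inputs):
    return sum((evaluate(program, x) - target(x)) ** 2 for x in inputs)


def reestimate(model, elite, smoothing=0.1):
    """Set each node's distribution to smoothed symbol frequencies among the elite programs."""
    new_model = []
    for node, dist in enumerate(model):
        counts = {s: smoothing for s in dist}
        for program in elite:
            counts[program[node]] += 1.0
        total = sum(counts.values())
        new_model.append({s: c / total for s, c in counts.items()})
    return new_model


def search(target, inputs, iterations=20, population=50, elite_size=10, seed=0):
    rng = random.Random(seed)
    model = uniform_model()
    best = None
    for _ in range(iterations):
        programs = [sample_program(model, rng) for _ in range(population)]
        programs.sort(key=lambda p: error(p, target, inputs))
        if best is None or error(programs[0], target, inputs) < error(best, target, inputs):
            best = programs[0]
        model = reestimate(model, programs[:elite_size])
    return best, model


best_program, final_model = search(lambda x: x + 1.0, inputs=[0.0, 1.0, 2.0, 3.0])
```

    Replacing the per-node distributions with a network that captures dependencies between nodes is the part that distinguishes POLE from this univariate sketch.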
  • LabelMe-iPhone: Hand-held Visual Object Tagging and Recognition

    Large databases of labeled images are a key ingredient in building
    large-scale object recognition systems and for this purpose MIT’s
    computer vision group has produced LabelMe, a web-based image
    annotation tool and online repository. LabelMe has helped shape the
    frontier of object recognition research and we feel it is time to
    start thinking about annotation and recognition beyond the desktop.
    We are primarily seeking a highly motivated UROP candidate to help us
    extend the LabelMe web-based interface to the iPhone, which will let

    Posted date: March 29, 2012

    We are building a new programming language for image processing and
    computational photography which compiles clean algorithm descriptions
    to very high performance implementations on mobile devices. You will
    build an image editing app (or apps) for iOS, along the lines of
    SnapSeed and Adobe Revel, using our language to implement processing
    routines. The app will be a key demo of our technology, and can be
    distributed in the app store. Experience with iOS development is a plus.

    Contact: Jonathan Ragan-Kelley and Fredo Durand

    Posted date: March 28, 2012
  • Google your Life

    Imagine an automatic private diary that records your life.
    For example, it allows you to:
    - Manage your time and get statistics about the time you spent with specific friends or family, or at specific places.
    - Search it for all the restaurants that you visited last year and send the list to a guest.
    - See where you celebrated every birthday of your life.
    - Publish parts of your auto autobiography to the world, and to your grandchildren in the future.

    Our group at DRL is developing solutions towards these goals
    based on collected data from smartphone sensors. We...

    Posted date: March 28, 2012