UROP Research Opportunities
The Undergraduate Research Opportunities Program (UROP) cultivates and supports research partnerships between MIT undergraduates and faculty. If you have any questions, please contact firstname.lastname@example.org or see the How to UROP at CSAIL document (PDF format).
This program is available to MIT students only.
Our FlexGP system currently generates thousands of non-linear models of the form y=f(x), where f(.) can be any mathematical function built from a set of operators such as log, sin, and sqrt. For example, an expression could be y=log(x1)+sin(x2). For big data problems we have to make multiple passes through the data, each time applying the models to the data and measuring their accuracy, to identify the set of non-linear expressions that best explains the data. In this regard we are investigating and developing methods in order to be able to evaluate a million models on a billion...Posted date: February 12, 2013
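A minimal sketch of the evaluation pass described in the FlexGP listing above: each candidate expression is applied to every row and ranked by mean squared error. The model set, the toy data, and all function names are assumptions made for illustration; this is not FlexGP code.

```python
import math

# Hypothetical candidate models; each maps a row (x1, x2) to a prediction.
MODELS = {
    "log(x1) + sin(x2)": lambda x1, x2: math.log(x1) + math.sin(x2),
    "sqrt(x1) + x2": lambda x1, x2: math.sqrt(x1) + x2,
    "sin(x1) + log(x2)": lambda x1, x2: math.sin(x1) + math.log(x2),
}


def mean_squared_error(model, rows, targets):
    """One pass over the data: apply the model to each row and average the squared error."""
    total = 0.0
    for (x1, x2), y in zip(rows, targets):
        total += (model(x1, x2) - y) ** 2
    return total / len(rows)


def rank_models(models, rows, targets):
    """Return (name, error) pairs sorted from most to least accurate."""
    scores = {name: mean_squared_error(f, rows, targets) for name, f in models.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data generated from the first expression (all inputs positive so log is defined).
ROWS = [(1.0, 0.5), (2.0, 1.0), (3.0, 1.5), (4.0, 2.0)]
TARGETS = [math.log(x1) + math.sin(x2) for x1, x2 in ROWS]

ranking = rank_models(MODELS, ROWS, TARGETS)
```

Each model costs one full pass over the rows here, which is the cost that grows quickly as the number of models and rows increases.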
Machine learning systems depend on parameters and sometimes, buried deep inside, some randomized initial state. So, when we run them on big data with different parameterizations, how can we unify the results? Also, how can we interpret an ML system's rules, classifiers, or models to learn how to iterate with an updated question for the ML system so we get better accuracy? Is the algorithm having trouble predicting a certain class? Why? Is it because of class imbalance or inadequate discriminatory power of a feature? Should we adjust the objective function to address these issues? Are the...Posted date: February 12, 2013
We are developing a system that predicts rare events such as hypotensive episodes in an ICU setting. We have assembled a large arterial blood pressure feature-level dataset from a publicly available waveform dataset. One of the challenges is that the class balance in the data is extremely skewed because the events we are interested in are rare. This imbalance can significantly impact the accuracy of the forecast, and it especially affects the dynamics of our iterative learning engine. The goal of the project is to develop and identify an efficient...Posted date: February 12, 2013
When dealing with big data we generate thousands of models, each of which specializes in a subset of the data. We are developing techniques that combine these models by learning weights for fusing their predictions. The techniques range from a simple average to a weighted sum to probabilistic approaches. Known as ensemble learning, these methods have allowed users to reach prediction accuracies higher than those of any single model.
See for example...Posted date: February 12, 2013
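The two simplest fusion schemes named in the listing above, a simple average and a weighted sum, can be sketched as follows. The weighting rule (weights proportional to inverse mean squared error) is an assumption chosen for illustration, as are the toy predictions and function names; the listing does not say how the weights are learned.

```python
def mse(predictions, targets):
    """Mean squared error of one model's predictions."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def simple_average(all_predictions):
    """Fuse by averaging the models' predictions for each data point."""
    n_models = len(all_predictions)
    return [sum(column) / n_models for column in zip(*all_predictions)]


def inverse_error_weights(all_predictions, targets, eps=1e-12):
    """Assumed weighting rule: weight each model by 1 / MSE, normalized to sum to 1."""
    raw = [1.0 / (mse(preds, targets) + eps) for preds in all_predictions]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_sum(all_predictions, weights):
    """Fuse by a weighted sum of the models' predictions for each data point."""
    return [sum(w * p for w, p in zip(weights, column)) for column in zip(*all_predictions)]


# Toy example. In practice the weights would be fit on held-out data,
# not on the same points used to judge the fused result.
targets = [1.0, 2.0, 3.0]
preds = [
    [1.1, 2.1, 3.1],  # model A: small positive bias
    [0.9, 1.9, 2.9],  # model B: small negative bias
    [2.0, 3.0, 4.0],  # model C: large bias
]

avg = simple_average(preds)
weights = inverse_error_weights(preds, targets)
fused = weighted_sum(preds, weights)
```

On this toy data the weighted sum gives the badly biased model C very little influence, whereas the simple average lets it pull every prediction upward.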
Reviewing past and current literature is a key scientific and engineering activity. Quantitative analysis of the language, topics, keywords, or citation graphs of any given subset of scientific literature (bibliometrics) can be a great help to understanding what has been done in a field and what the important next steps are. Unfortunately, off-the-shelf support for automatic extraction of citation graphs and analysis of those graphs using natural language processing and machine learning is still relatively limited. This UROP project will aim to advance the state of the art of...Posted date: February 04, 2013
For several decades neurologists have used a deceptively simple test to help diagnose cognitive capabilities: patients are asked to draw a clock face showing a particular time. New technology -- a ballpoint pen that digitizes as you write -- has made it possible to collect data from this test that is hundreds of times more precise than anything that can be discerned from ink on paper, as well as enabling virtually instant analysis of the data. Our current software makes several hundred measurements on each test, enabling new avenues of diagnosis for a range of diseases.
A multi-...Posted date: January 18, 2013
We are looking for strong Python programmers interested in contributing to a core learning architecture that is intended to set the international standard for reinforcement learning research. Through the process you will learn how to formulate many sequential decision-making problems, such as balancing an inverted pendulum, as Markov Decision Processes (MDPs), and how to solve such problems.
The following skills are required:
- Object Oriented Programming in Python
- Linear Algebra
The following skills are highly desired:
-...Posted date: December 19, 2012
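To make the MDP formulation mentioned in the listing above concrete, here is a minimal sketch that solves a heavily simplified, discretized balancing task with value iteration. The five tilt states, the deterministic dynamics, the reward, and the discount factor are all assumptions invented for illustration; they are not taken from the project's architecture.

```python
STATES = [-2, -1, 0, 1, 2]  # discretized tilt; -2 and 2 mean the pole has fallen
ACTIONS = [-1, 1]           # push left or push right
FALLEN = {-2, 2}
GAMMA = 0.9                 # discount factor


def step(state, action):
    """Deterministic toy dynamics: the push shifts the tilt by one unit."""
    if state in FALLEN:
        return state, 0.0   # absorbing state, no further reward
    next_state = state + action
    reward = 0.0 if next_state in FALLEN else 1.0
    return next_state, reward


def q_value(values, state, action):
    """One-step lookahead: immediate reward plus discounted value of the next state."""
    next_state, reward = step(state, action)
    return reward + GAMMA * values[next_state]


def value_iteration(tol=1e-9):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {s: max(q_value(values, s, a) for a in ACTIONS) for s in STATES}
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(values, s, a))
        for s in STATES
        if s not in FALLEN
    }
    return values, policy


values, policy = value_iteration()
```

The resulting policy pushes against the tilt: left when the pole leans right and right when it leans left.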
At CSAIL’s Decentralized Information Group, we think about information on the Web: where it comes from, what happens to it, and what the rules are for using it. We’ve all seen stories about what people can learn about you from social networks, and the good and bad consequences of that. How can we promote the benefits while minimizing risks and harmful effects? At DIG, we take the perspective that data on the Web should travel together with additional information that says where it comes from (provenance and context) and how it should be used (policy). We build to help...Posted date: December 11, 2012
In December 2010, Google introduced App Inventor for Android, a visual programming environment that makes it easy to create apps for Android phones. Prof. Abelson, who worked on App Inventor during his sabbatical last year, is planning to include the system in a new course this fall. He’s looking for help in developing good examples and creating extensions to the system, maybe even teaching summer workshops for kids. To work on this project, you should have some experience with Python (6.00 or 6.01) and an interest in mobile apps and educational technology. For more information see...Posted date: December 11, 2012
We are developing new authoring software for video lectures in the “virtual white board” style popularized by the Khan Academy. Many UROP topics are available to help with this endeavor.
Contact: Fredo Durand email@example.com Posted date: December 04, 2012
The goal of this UROP is to augment the Online Python Tutor http://pythontutor.com/ with visualizations of a program's control flow and variable changes over time. This is partially inspired by Bret Victor’s Learnable Programming essay http://worrydream.com/LearnableProgramming/ .
Contact: Fredo Durand firstname.lastname@example.org Posted date: December 04, 2012
We are developing means to automatically assist analysts and experts in identifying patterns and detecting anomalies in big data streams as they arise when heterogeneous, unstructured data sources are consulted. Our approach relies solely on analysts and their ability to group or categorize "common situations" into patterns. As one can imagine, an analyst can only process a subset of the big data. We are developing machine learning algorithms that will use these partial groupings for a subset of the big data from each analyst and knit together, i.e....Posted date: December 04, 2012
What is Julia? See the blog post http://julialang.org/blog/2012/02/why-we-created-julia/ , which explains why we created Julia. In short, because we are greedy. We are power Matlab users. Some of us are Lisp hackers. Some are Pythonistas, others Rubyists, still others Perl hackers. There are those of us who used Mathematica before we could grow facial hair. There are those who still can’t grow facial hair. We’ve generated more R plots than any sane...Posted date: August 07, 2012
The MIT Model-based Embedded & Robotic Systems (MERS) group is performing research into controlling autonomous systems using high-level, model-based and task directed languages. To that end, we are developing the Reactive Model-based Programming Language (RMPL). RMPL unifies autonomous plant description, control programs, temporal constraints, and more into a single language. We are looking for UROP students passionate about making programming autonomous systems easier and more accessible. The specific tasks for a UROP depend on the student's interests. Possibilities include: adding...Posted date: August 07, 2012
We are the MIT Center for Mobile Learning, and we have several
UROP slots for work in educational computing and mobility. The current
opportunities involve App Inventor, a programming tool that lets
anyone create their own apps for Android smartphones. We run
a major public cloud-based IDE that lets people all over the world
create their own mobile applications and lets schools all over the
world include mobile app programming in their middle and high school
UROP projects can range anywhere from developing new curriculum and...Posted date: July 06, 2012
Mathematical proofs are a core concept in many areas of computer science and other fields, but how to write them is not especially easy to learn. The rules of the game, determining which proof steps are convincing and which are cop-outs, can seem mysterious. In the computer theorem proving world, there are some venerable algorithmic definitions of proof validity, but they have rarely been applied in contexts with non-expert proof authors. The UROP opportunity I'm proposing would join existing work to build tools for insertion of machine-checked theorem proving in undergraduate courses,...Posted date: July 06, 2012
We are currently developing a C++ software system for experimenting with new designs in glass blowing, to explore the design space before actually making something. Our current focus is to develop a system for glass "cane", and to develop new artistic designs of cane, which has been a relatively stagnant field for several years. The project currently involves computer graphics (OpenGL, meshing), geometry (for glass-blowing transformations), Qt user interface design (a kind of visual programming language for glassblowing), and multithreading (for background...Posted date: June 25, 2012
We want to automatically evaluate the correctness of image processing code in the context of online courses. Students will submit Python code, which we will run on new input images. The main challenge is that multiple answers might be "correct" from the perspective of the course, so we need to develop comparison tools that take this set of possibilities into account.
Contact: Fredo Durand, email@example.com Posted date: April 23, 2012
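One simple way to account for several "correct" answers, sketched here only as an illustration, is to accept a submission if it is close enough to any one of a set of reference outputs. The grayscale nested-list image format, the mean absolute difference metric, the tolerance value, and the function names are all assumptions, not the project's design.

```python
def mean_abs_diff(img_a, img_b):
    """Mean absolute pixel difference; infinite if the image shapes differ."""
    if len(img_a) != len(img_b):
        return float("inf")
    if any(len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)):
        return float("inf")
    total = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def is_acceptable(submission, references, tolerance=0.02):
    """Accept the submission if it matches at least one reference within the tolerance."""
    return any(mean_abs_diff(submission, ref) <= tolerance for ref in references)


# Two hypothetical reference outputs, e.g. from different but valid edge-handling choices.
references = [
    [[0.0, 0.5], [0.5, 1.0]],
    [[0.1, 0.5], [0.5, 0.9]],
]

close_to_second = [[0.1, 0.51], [0.5, 0.9]]
far_from_both = [[1.0, 1.0], [1.0, 1.0]]
wrong_shape = [[0.0, 0.5]]

results = [is_acceptable(img, references) for img in (close_to_second, far_from_both, wrong_shape)]
```

A fixed list of references only covers variations anticipated in advance, which is part of why the comparison problem is described as a challenge.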
Program optimization with linkage estimation (POLE) employs a
Bayesian network as the probabilistic model and is based on a
prototype-tree. A prototype tree is a single complete n-ary tree (where n is the maximum arity of the non-terminals) that represents a program. Each node in the tree is mapped to a random variable with its support corresponding to the discrete values for the choice set it can take. Programs are formed by sampling from this probabilistic model. More efficient programs are then used to re-estimate the distribution and then re-sample. Each iteration of this...Posted date: April 19, 2012
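The sample, select, and re-estimate loop over a prototype tree can be sketched as below. This is a deliberately simplified stand-in: it models each node with an independent categorical distribution rather than the Bayesian network that POLE uses, and the symbol set, target function, selection size, and smoothing are assumptions chosen for illustration.

```python
import random

FUNCTIONS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}  # non-terminals, arity 2
TERMINALS = ["x", "1"]
ARITY = 2
DEPTH = 3                                               # levels in the complete tree
N_NODES = (ARITY ** DEPTH - 1) // (ARITY - 1)           # 7 nodes, stored heap-style
FIRST_LEAF = (ARITY ** (DEPTH - 1) - 1) // (ARITY - 1)  # bottom level starts at index 3


def support(node):
    """Choice set for a node: bottom-level nodes may only hold terminals."""
    return list(TERMINALS) if node >= FIRST_LEAF else list(FUNCTIONS) + list(TERMINALS)


def uniform_model():
    """One categorical distribution per prototype-tree node."""
    return [{s: 1.0 / len(support(i)) for s in support(i)} for i in range(N_NODES)]


def sample_program(model, rng):
    """Sample one symbol per node; nodes below a terminal are simply ignored."""
    return [rng.choices(list(d), weights=list(d.values()))[0] for d in model]


def evaluate(program, x, node=0):
    symbol = program[node]
    if symbol == "x":
        return x
    if symbol == "1":
        return 1.0
    left = evaluate(program, x, ARITY * node + 1)
    right = evaluate(program, x, ARITY * node + 2)
    return FUNCTIONS[symbol](left, right)


def error(program, xs=(0.0, 1.0, 2.0, 3.0)):
    """Squared error against an assumed target function x*x + x."""
    return sum((evaluate(program, x) - (x * x + x)) ** 2 for x in xs)


def reestimate(selected, smoothing=0.1):
    """Re-fit each node's distribution from symbol counts in the selected programs."""
    model = []
    for i in range(N_NODES):
        counts = {s: smoothing for s in support(i)}
        for program in selected:
            counts[program[i]] += 1.0
        total = sum(counts.values())
        model.append({s: c / total for s, c in counts.items()})
    return model


def run(generations=20, population=200, keep=40, seed=0):
    rng = random.Random(seed)
    model = uniform_model()
    best = None
    for _ in range(generations):
        programs = sorted((sample_program(model, rng) for _ in range(population)), key=error)
        if best is None or error(programs[0]) < error(best):
            best = programs[0]
        model = reestimate(programs[:keep])  # better programs shape the next distribution
    return best, model


best, model = run()
```

Because the nodes are modeled independently, this sketch cannot capture dependencies between nodes, which is what a Bayesian network over the same prototype tree is meant to address.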
Large databases of labeled images are a key ingredient in building
large-scale object recognition systems and for this purpose MIT’s
computer vision group has produced LabelMe, a web-based image
annotation tool and online repository. LabelMe has helped shape the
frontier of object recognition research and we feel it is time to
start thinking about annotation and recognition beyond the desktop.
We are primarily seeking a highly motivated UROP candidate to help us
extend the LabelMe web-based interface to the iPhone, which will let
users...Posted date: March 29, 2012