This community is interested in understanding and affecting the interaction between computing systems and society through engineering, computer science and public policy research, education, and public engagement.
Automatic speech recognition (ASR) has been a grand challenge machine learning problem for decades. Our ongoing research in this area examines the use of deep learning models for distant and noisy recording conditions, multilingual, and low-resource scenarios.
Data often has geometric structure which can enable better inference; this project aims to scale up geometry-aware techniques for use in machine learning settings with lots of data, so that this structure may be utilized in practice.
We develop algorithms, systems and software architectures for automating reconstruction of accurate representations of neural tissue structures, such as nanometer-scale neurons' morphology and synaptic connections in the mammalian cortex.
The Robot Compiler allows non-engineering users to rapidly fabricate customized robots, facilitating the proliferation of robots in everyday life. It thereby marks an important step towards the realization of personal robots that have captured imaginations for decades.
All humans process vast quantities of unannotated speech and manage to learn phonetic inventories, word boundaries, etc., and can use these abilities to acquire new word. Why can't ASR technology have similar capabilities? Our goal in this research project is to build speech technology using unannotated speech corpora.
The goal of this project is to develop and test a wearable ultrasonic echolocation aid for people who are blind and visually impaired. We combine concepts from engineering, acoustic physics, and neuroscience to make echolocation accessible as a research tool and mobility aid.
Our goal is to build a system that predicts where people are looking in images. Given an image and the location of a head, our approach follows the gaze of the person and identifies the object being looked at.
A new MIT study finds “health knowledge graphs,” which show relationships between symptoms and diseases and are intended to help with clinical diagnosis, can fall short for certain conditions and patient populations. The results also suggest ways to boost their performance.
Last week MIT’s Institute for Foundations of Data Science (MIFODS) held an interdisciplinary workshop aimed at tackling the underlying theory behind deep learning. Led by MIT professor Aleksander Madry, the event focused on a number of research discussions at the intersection of math, statistics, and theoretical computer science.
MIT’s Amar Gupta and his wife Poonam were on a trip to Los Angeles in 2016 when she fell and broke both wrists. She was whisked by ambulance to a reputable hospital. But staff informed the couple that they couldn’t treat her there, nor could they find another local hospital that would do so. In the end, the couple was forced to take the hospital’s stunning advice: return to Boston for treatment.
This week it was announced that MIT professors and CSAIL principal investigators Shafi Goldwasser, Silvio Micali, Ronald Rivest, and former MIT professor Adi Shamir won this year’s BBVA Foundation Frontiers of Knowledge Awards in the Information and Communication Technologies category for their work in cryptography.
Neural networks, which learn to perform computational tasks by analyzing huge sets of training data, have been responsible for the most impressive recent advances in artificial intelligence, including speech-recognition and automatic-translation systems.
Hyper-connectivity has changed the way we communicate, wait, and productively use our time. Even in a world of 5G wireless and “instant” messaging, there are countless moments throughout the day when we’re waiting for messages, texts, and Snapchats to refresh. But our frustrations with waiting a few extra seconds for our emails to push through doesn’t mean we have to simply stand by.
The butt of jokes as little as 10 years ago, automatic speech recognition is now on the verge of becoming people’s chief means of interacting with their principal computing devices. In anticipation of the age of voice-controlled electronics, MIT researchers have built a low-power chip specialized for automatic speech recognition. Whereas a cellphone running speech-recognition software might require about 1 watt of power, the new chip requires between 0.2 and 10 milliwatts, depending on the number of words it has to recognize.