We are an interdisciplinary group of researchers blending approaches from human-computer interaction, social computing, databases, and information management.
We investigate language in different contexts: from how it is learned, to how it is grounded in visual perception, all the way to how machines can readily interact with humans.
Led by Web inventor and Director Tim Berners-Lee and CEO Jeff Jaffe, the W3C focuses on leading the World Wide Web to its full potential by developing standards, protocols, and guidelines that ensure the long-term growth of the Web.
Our mission is to work with policy makers and cybersecurity technologists to increase the trustworthiness and effectiveness of interconnected digital systems.
We build tools to allow a community of people to collectively summarize large discussions online and manage knowledge embedded within these discussions.
Wait-learning makes it easier for busy people to learn informally, by automatically detecting when they are waiting and delivering optional learning exercises that can be completed during wait time.
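As a minimal illustration of the idea (not the project's actual implementation), the sketch below polls a stand-in "waiting" signal and offers a flashcard-style exercise whenever a wait is detected; the wait detector and the flashcard deck are assumptions made purely for illustration.

```python
import random
import time

# Hypothetical flashcard deck; the real system would draw exercises from a learning app.
FLASHCARDS = [("bonjour", "hello"), ("merci", "thank you"), ("chat", "cat")]

def is_waiting() -> bool:
    """Stand-in wait detector: a real system would watch for cues such as a
    spinner on screen or an outgoing message still awaiting a reply."""
    return random.random() < 0.3  # simulate an occasional wait period

def offer_exercise() -> None:
    prompt, answer = random.choice(FLASHCARDS)
    print(f"While you wait: what does '{prompt}' mean? (answer: {answer})")

for _ in range(10):          # stand-in for an event loop in the host application
    if is_waiting():
        offer_exercise()     # the exercise is optional and fits inside the wait
    time.sleep(0.1)
```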
Our goal is to develop new applications and algorithms that leverage the skills of distributed crowdworkers, notably for image and video processing applications.
We take a mixed-methods approach, combining qualitative methods (interviews and coding) with computational (AI) methods, to understand the relationships between social identities, cultural values, and virtual identity technologies (e.g., online profiles and avatars).
This study aims to understand the behaviors, motivations, and gameplay types of videogame players through topic modelling, sentiment clustering, and valence analysis of videogame reviews.
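A minimal sketch of the kind of pipeline described, assuming scikit-learn is available; the toy reviews and the tiny valence lexicon below are placeholders, not the study's data or lexicon.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Placeholder reviews standing in for a real corpus of videogame reviews.
reviews = [
    "great story and characters, loved the ending",
    "combat feels clunky and the controls are frustrating",
    "beautiful open world, relaxing exploration",
    "multiplayer is fun with friends but matchmaking is slow",
]

# Topic modelling: fit a small LDA model over the review term counts.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_terms = [terms[j] for j in topic.argsort()[-4:]]
    print(f"topic {i}: {top_terms}")

# Valence analysis with a toy lexicon; a real study would use a validated one.
VALENCE = {"great": 1, "loved": 1, "fun": 1, "beautiful": 1,
           "clunky": -1, "frustrating": -1, "slow": -1}
for review in reviews:
    score = sum(VALENCE.get(word, 0) for word in review.split())
    print(f"{score:+d}  {review}")
```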
We aim to study the impact of computer-supported roleplaying in changing social perspectives of digital media users. Such media could take the form of videogames, VR systems, training software, and other types of interactive narrative technology.
Uhura is an autonomous system that collaborates with humans in planning and executing complex tasks, especially in over-subscribed and risky situations.
What is design thinking in the context of software? We're developing a new approach to software design that achieves usability and robustness by going deeper than the user interface.
The goal of this project is to develop and test a wearable ultrasonic echolocation aid for people who are blind and visually impaired. We combine concepts from engineering, acoustic physics, and neuroscience to make echolocation accessible as a research tool and mobility aid.
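The core acoustic relationship is simple: an echo returning after time t from an object at distance d has travelled 2d at the speed of sound, so d = c·t/2. A minimal sketch with illustrative values (not the device's actual signal-processing pipeline):

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

def echo_distance(round_trip_seconds: float) -> float:
    """Distance to a reflecting object from the round-trip time of an
    ultrasonic pulse: the sound travels out and back, hence the factor of 2."""
    return SPEED_OF_SOUND * round_trip_seconds / 2.0

# Example: an echo returning 12 ms after the pulse implies an object about 2 m away.
print(f"{echo_distance(0.012):.2f} m")  # 2.06 m
```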
Almost every object we use is developed with computer-aided design (CAD). While CAD programs are good for creating designs, using them to actually improve existing designs can be difficult and time-consuming.
CilkPride is a programming environment that aims to make performance and safety information always available and appropriately visible to the programmer.
Mavo is a language that lets anyone turn a static HTML document into a fully functioning reactive web application with data presentation, editing, storage and lightweight computation, all without writing a single line of JavaScript or other programming code.
Our goal is to enable robots to understand and execute natural language commands from human agents. We develop algorithms that allow a robot to interpret, learn and reason about semantic concepts embedded in language in the context of low-level metric representations perceived from sensors.
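As a toy illustration of grounding a command in a metric representation (not the group's actual algorithms), the sketch below matches a noun phrase in a command against labelled objects in a 2-D map and returns a goal coordinate; the map, labels, and naive string matching are assumptions made for illustration.

```python
# Hypothetical metric map: object label -> (x, y) position in metres.
WORLD = {"red box": (2.0, 1.5), "blue chair": (0.5, 3.0), "door": (4.0, 0.0)}

def ground_command(command: str) -> tuple[float, float]:
    """Naively ground a command like 'go to the red box' by matching a known
    object label inside the text; real systems reason over far richer semantics."""
    text = command.lower()
    for label, position in WORLD.items():
        if label in text:
            return position
    raise ValueError(f"no known object mentioned in: {command!r}")

print(ground_command("Please go to the red box"))  # (2.0, 1.5)
```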
In collaboration with mathematicians at Tufts University, we are studying how to establish fair, mathematically well-posed, and computationally tractable standards for political redistricting.
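One concrete, widely used building block in this space is the Polsby-Popper compactness score, 4π·A/P², which equals 1 for a circle and shrinks toward 0 for more contorted district shapes. A short sketch of that single measure (illustrative only, not the standards under study):

```python
import math

def polsby_popper(area: float, perimeter: float) -> float:
    """Polsby-Popper compactness: 4 * pi * area / perimeter**2.
    Equals 1.0 for a circle; long, winding districts score near 0."""
    return 4.0 * math.pi * area / perimeter ** 2

print(polsby_popper(math.pi, 2 * math.pi))  # unit circle   -> 1.0
print(polsby_popper(1.0, 4.0))              # unit square   -> ~0.785
print(polsby_popper(10.0, 22.0))            # 1x10 rectangle -> ~0.26
```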
Our goal is to develop a framework for selecting between visual and haptic modalities for navigation systems used by operators in high-workload environments.
Our goal is to develop collaborative agents (software or robots) that can efficiently communicate with their human teammates. Key threads involve designing algorithms for inferring human behavior and for decision-making under uncertainty.
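A tiny illustration of one ingredient mentioned above, Bayesian inference of a teammate's intent from an observed action; this is a sketch under simplified assumptions, not the group's algorithms, and the goals and likelihoods are made up for the example.

```python
# Prior belief over which of two goals the human teammate is pursuing.
belief = {"fetch tool": 0.5, "inspect part": 0.5}

# Assumed likelihoods of observing "walks toward the bench" under each goal.
likelihood = {"fetch tool": 0.8, "inspect part": 0.3}

# Bayes rule: posterior is proportional to prior * likelihood, then normalise.
posterior = {goal: belief[goal] * likelihood[goal] for goal in belief}
total = sum(posterior.values())
belief = {goal: p / total for goal, p in posterior.items()}

print(belief)  # {'fetch tool': ~0.73, 'inspect part': ~0.27}
```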
The computer as a medium offers a unique expressive palette for storytellers. With it, we can build and convey models of crucial, moving issues in our world. As a step toward this aim as it relates to sexism, we present our interactive narrative, Grayscale. The experience is intended to provoke players to reflect critically on sexism in the workplace, in both its overt, hostile forms and its more subtle ones.
September 12, 2018 - Kate Starbird of the University of Washington gave a Hot Topics in Computing Lecture titled "Muddied Waters: Online Disinformation During Crisis Events."
Last week CSAIL hosted the fourth installment of the “Hot Topics in Computing” speaker series, a monthly forum where experts hold discussions with community members on various hot-button tech topics.
Neural networks, which learn to perform computational tasks by analyzing huge sets of training data, have been responsible for the most impressive recent advances in artificial intelligence, including speech-recognition and automatic-translation systems.
Communicating through computers has become an extension of our daily reality. But as speaking via screens has become commonplace, our exchanges are losing inflection, body language, and empathy. Danielle Olson ’14, a first-year PhD student at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), believes we can make digital information-sharing more natural and interpersonal, by creating immersive media to better understand each other’s feelings and backgrounds.
Most robots are programmed using one of two methods: learning from demonstration, in which they watch a task being done and then replicate it, or via motion-planning techniques such as optimization or sampling, which require a programmer to explicitly specify a task’s goals and constraints.
Hyper-connectivity has changed the way we communicate, wait, and productively use our time. Even in a world of 5G wireless and “instant” messaging, there are countless moments throughout the day when we’re waiting for messages, texts, and Snapchats to refresh. But our frustrations with waiting a few extra seconds for our emails to push through don’t mean we have to simply stand by.
The butt of jokes as little as 10 years ago, automatic speech recognition is now on the verge of becoming people’s chief means of interacting with their principal computing devices. In anticipation of the age of voice-controlled electronics, MIT researchers have built a low-power chip specialized for automatic speech recognition. Whereas a cellphone running speech-recognition software might require about 1 watt of power, the new chip requires between 0.2 and 10 milliwatts, depending on the number of words it has to recognize.
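The reported figures imply a power reduction of roughly two to four orders of magnitude; a quick back-of-the-envelope check of those numbers:

```python
phone_watts = 1.0             # typical software ASR on a phone, per the article
chip_watts = (0.0002, 0.010)  # 0.2 mW to 10 mW for the new chip

for watts in chip_watts:
    print(f"{watts * 1000:.1f} mW -> about {phone_watts / watts:,.0f}x less power")
# 0.2 mW -> about 5,000x less power
# 10.0 mW -> about 100x less power
```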
Speech recognition systems, such as those that convert speech to text on cellphones, are generally the result of machine learning. A computer pores through thousands or even millions of audio files and their transcriptions, and learns which acoustic features correspond to which typed words. But transcribing recordings is costly, time-consuming work, which has limited speech recognition to a small subset of languages spoken in wealthy nations.
For people struggling with obesity, logging calorie counts and other nutritional information at every meal is a proven way to lose weight. The technique does require consistency and accuracy, however, and when it fails, it’s usually because people don't have the time to find and record all the information they need. A few years ago, a team of nutritionists from Tufts University who had been experimenting with mobile-phone apps for recording caloric intake approached members of the Spoken Language Systems Group at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), with the idea of a spoken-language application that would make meal logging even easier.
Every language has its own collection of phonemes, or the basic phonetic units from which spoken words are composed. Depending on how you count, English has somewhere between 35 and 45. Knowing a language’s phonemes can make it much easier for automated systems to learn to interpret speech. In the 2015 volume of Transactions of the Association for Computational Linguistics, CSAIL researchers describe a new machine-learning system that, like several systems before it, can learn to distinguish spoken words. But unlike its predecessors, it can also learn to distinguish lower-level phonetic units, such as syllables and phonemes.
CSAIL’s Spoken Language Systems Group has unveiled a new technique for automatically tracking speakers in audio recordings. The new technique tackles the task of speaker diarization: computationally determining who is speaking when, and how many speakers are present, in a recording.
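A minimal sketch of one common diarization ingredient, clustering per-segment speaker embeddings to estimate how many speakers are present; the synthetic embeddings and the distance threshold are assumptions for illustration, not the group's new technique.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)

# Toy "speaker embeddings": segments from two well-separated speakers.
speaker_a = rng.normal(loc=0.0, scale=0.1, size=(20, 8))
speaker_b = rng.normal(loc=3.0, scale=0.1, size=(15, 8))
segments = np.vstack([speaker_a, speaker_b])

# Cluster without fixing the number of clusters; the distance threshold
# (an assumed value) controls how different two speakers must sound.
clustering = AgglomerativeClustering(n_clusters=None, distance_threshold=5.0)
labels = clustering.fit_predict(segments)

print("estimated number of speakers:", len(set(labels)))  # expect 2
```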
We aim to develop a systematic framework for robots to build models of the world and to use those models to choose effective, safe actions in complex scenarios.
Alloy is a language for describing structures and a tool for exploring them. It has been used in a wide range of applications, from finding holes in security mechanisms to designing telephone switching networks. One of the main remaining challenges is making the system more usable and understandable for its users.
Self-driving cars are likely to be safer, on average, than human-driven cars. But they may fail in new and catastrophic ways that a human driver could prevent. This project is designing a new architecture for a highly dependable self-driving car.
Arabic is spoken by hundreds of millions of people around the world and presents a variety of challenges for speech and language processing technologies. In our group, we have several research topics examining Arabic, including dialect identification, speech recognition, machine translation, and language processing.
Automatic speech recognition (ASR) has been a grand-challenge machine learning problem for decades. Our ongoing research in this area examines the use of deep learning models for distant and noisy recording conditions, as well as for multilingual and low-resource scenarios.
This project aims to let people correct robot mistakes with nothing more than their brain signals, allowing robots to adapt to humans rather than the other way around.
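One plausible ingredient, sketched here under simplified assumptions rather than as the project's actual pipeline, is a binary classifier that flags brain-signal segments recorded just after a robot action as "error perceived" or not; the synthetic feature vectors below stand in for real EEG features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic stand-ins for per-trial EEG features (e.g. averaged post-action
# amplitudes); real error-related signals are far noisier than this.
error_trials = rng.normal(loc=1.0, scale=1.0, size=(100, 16))
normal_trials = rng.normal(loc=0.0, scale=1.0, size=(100, 16))
X = np.vstack([error_trials, normal_trials])
y = np.array([1] * 100 + [0] * 100)  # 1 = "the human perceived a mistake"

clf = LogisticRegression(max_iter=1000).fit(X, y)

# At run time, a flagged trial would prompt the robot to undo or revise its action.
new_trial = rng.normal(loc=1.0, scale=1.0, size=(1, 16))
print("stop and correct?", bool(clf.predict(new_trial)[0]))
```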
Knitting is the new 3D printing. It has become popular again with the widespread availability of patterns and templates, together with the maker movement. Lower-cost industrial knitting machines are starting to emerge, but the corresponding design tools are still missing. Our goal is to fill this gap.
The robot garden provides an aesthetically pleasing educational platform that can visualize computer science concepts and encourage young students to pursue programming and robotics.
Déjà Vu is a new platform for end-user development of apps with rich functionality. It features a novel theory of modularity for binding concepts; an extensive library of reusable concepts; and a WYSIWYG tool for specifying bindings and customizing visual layout.
The creation of low-power circuits capable of speech recognition and speaker verification will enable spoken interaction on a wide variety of devices in the era of the Internet of Things.