CSAIL Event Calendar: Previous Series
Automatically Learning the Structure of Spoken Language Without Supervision
Speaker: Aren Jansen, Human Language Technology Center of Excellence and Center for Language and Speech Processing, Johns Hopkins University
Relevant URL: http://cbcl.mit.edu
Abstract: The dominant paradigm in the speech recognition community for the past four decades has been to train automatic systems with as much transcribed data as we can get our greedy hands on. This strategy has led to the development of highly accurate systems that have finally found a place in our daily lives in the form of popular applications such as Siri on the Apple iPhone. An unfortunate consequence of this trajectory, however, is that state-of-the-art recognition performance can only be achieved on languages and domains for which vast transcribed training resources either exist or can be easily obtained. Meanwhile, thanks to public internet resources like YouTube and podcasts, untranscribed speech audio is abundant and contains a wealth of hidden information regarding the acoustic-phonetic, lexical, grammatical, and semantic structure of the language being spoken. The trick is uncovering this structure automatically, an endeavor that will require new machine learning techniques, algorithms scalable to massive problem sizes, and a lot of patience. I will provide an overview of my efforts in these directions and describe some useful language- and domain-independent technologies that have been produced along the way.