Understanding Language Representations in Deep Learning Models
Our goal is to explore language representations in computational models. We develop new models for representing natural language and investigate how existing models learn language, focusing on neural networks for key tasks such as machine translation and speech recognition.
Representing language is a key problem in developing human language technologies. In recent years, vector representations of words have regained popularity thanks to advances in efficient methods for inducing high-quality representations from large amounts of raw text. Empirically, neural vector representations have been applied successfully to diverse tasks in language processing and understanding. Yet while neural networks achieve state-of-the-art results on many tasks, the vector representations they learn are opaque, and it is not clear what kind of information they capture.

We aim to open the black box of language representations in deep learning models. Our key question is: what kind of information do the learned representations capture? We answer it by alternating between developing new models and measuring the information that their learned representations capture. We focus on end-to-end models for machine translation and speech recognition, and study how their representations capture different language properties.
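One common way to measure what a learned representation captures is a probing classifier: freeze the representations and train a simple classifier on top of them to predict a linguistic property; high probe accuracy suggests the property is encoded. The sketch below illustrates the idea with synthetic data standing in for frozen model representations and a hypothetical linearly encoded property; it is not the pipeline described above, just a minimal, self-contained illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: in practice, `reps` would be frozen hidden states
# from a trained model, and `labels` a real linguistic property
# (e.g. part of speech). Here the property is linearly encoded by design.
n, d = 1000, 32
reps = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
labels = (reps @ w_true > 0).astype(float)

# Logistic-regression probe trained by gradient descent on the frozen reps.
w = np.zeros(d)
b = 0.0
lr = 0.1
for _ in range(500):
    logits = reps @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    grad_w = reps.T @ (probs - labels) / n
    grad_b = np.mean(probs - labels)
    w -= lr * grad_w
    b -= lr * grad_b

acc = np.mean((reps @ w + b > 0) == (labels == 1))
print(f"probe accuracy: {acc:.2f}")
```

Because the synthetic property here is linearly recoverable, the probe reaches high accuracy; on real representations, a gap between probe accuracy and a baseline is what indicates how much of the property the representation encodes.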