Modern machine learning models are highly flexible but lack transparency. Can we devise methods that explain the predictions of such models without restricting their expressiveness? Can we do so even when we know nothing about their architecture, i.e., when they are "black boxes"? In this project, we develop methods that explain the predictions a model makes, rather than constraining the model itself to be interpretable. We are particularly interested in explaining the predictions of complex machine learning models that operate on structured data, such as sentences, trees, or graphs. For example, we use statistical input-output analysis to learn to interpret the predictions of sequence-to-sequence models, such as those used in machine translation and dialogue systems.
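As a rough illustration of the input-output idea, the sketch below perturbs an input sentence by randomly dropping tokens, queries a black-box model on each perturbed version, and estimates how often each output token disappears when a given input token is removed. It is not the project's actual method. The toy word-by-word "translator" (`toy_black_box`), the function `dependency_scores`, and the parameters `n_samples` and `drop_prob` are all hypothetical names chosen for this example, and the sketch assumes the input tokens are distinct.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for a black-box sequence-to-sequence model:
# a toy word-by-word "translator". Only its input-output behavior is used.
TOY_LEXICON = {"the": "le", "cat": "chat", "sleeps": "dort"}


def toy_black_box(tokens):
    """Map each input token to an output token; unknown words become <unk>."""
    return [TOY_LEXICON.get(t, "<unk>") for t in tokens]


def dependency_scores(model, tokens, n_samples=200, drop_prob=0.3, seed=0):
    """Estimate, for each (input token, output token) pair, the fraction of
    perturbed samples in which the output token vanished, among the samples
    where that input token was dropped. Assumes distinct input tokens."""
    rng = random.Random(seed)
    baseline = model(tokens)          # output on the unperturbed input
    dropped = [0] * len(tokens)       # how often each input token was dropped
    lost = defaultdict(int)           # (input index, output token) -> count

    for _ in range(n_samples):
        # Randomly keep or drop each input token.
        keep = [rng.random() >= drop_prob for _ in tokens]
        perturbed = [t for t, k in zip(tokens, keep) if k]
        output = set(model(perturbed))

        for i, kept in enumerate(keep):
            if kept:
                continue
            dropped[i] += 1
            for out in baseline:
                if out not in output:
                    lost[(i, out)] += 1

    return {
        (tokens[i], out): lost[(i, out)] / dropped[i]
        for i in range(len(tokens))
        for out in baseline
        if dropped[i] > 0
    }


if __name__ == "__main__":
    scores = dependency_scores(toy_black_box, ["the", "cat", "sleeps"])
    for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 2))
```

In this toy setting, pairs such as `("cat", "chat")` receive the highest score because the output token vanishes whenever its source token is dropped, while unrelated pairs score lower. A real system would need perturbations that keep the input well-formed and a more careful statistical estimate than these raw frequencies.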
If you would like to contact us about our work, please scroll down to the People section and open the page of one of the group leads, where you can reach out to them directly.