Project

Interpretability in complex machine learning models

Our goal is to develop methods that can "explain" the behavior of complex machine learning models, without restricting their power. We seek explanations that are simple, robust and grounded in statistical analysis of the model's behavior.

Modern machine learning models are highly flexible but lack transparency. Can we devise methods to explain the predictions of such models, without restricting their expressiveness? Can we do so even if we don't know anything about their architecture, i.e., if they are "black-boxes"? In this project, we are developing methods for explaining the predictions made rather than constraining the models themselves to be interpretable. We are particularly interested in providing explanations for the predictions of complex machine learning models that operate on structured data, such as sentences, trees or graphs. For example, we use statistical input-output analysis to learn to interpret predictions of sequence-to-sequence models, such as those used in machine translation and dialogue systems.

Group

Machine Learning

Communities

Vertical AI Community of Research

Contact us

If you would like to contact us about our work, please refer to our members below and reach out to one of the group leads directly.

Last updated Apr 24 '20

Project

Interpretability in complex machine learning models

Group

Communities

Contact us

Group

Communities

Members

Tommi Jaakkola