Learning causal variables with machine learning
Host
Divya Shanmugam
Science and decision-making require us to infer the effects of interventions. Does knocking out a given gene suppress a function of interest? Does a proposed tax actually change some behavior of interest? Causal models provide a language to model interventions, and help us derive assumptions that yield valid causal inference. Despite the role causality plays in the sciences, the applications of causal inference have been limited, often restricted to questions where all the variables are carefully measured. In contrast, the field of machine learning (ML) has arguably succeeded at extracting task-relevant information from unstructured inputs such as text and images, inputs that implicitly capture abstract variables. Nevertheless, variables inferred using ML may not be substitutes for the underlying but unknown causal variables: ML methods may entangle the underlying causal variables, or neglect to capture them, biasing downstream causal inference. In this talk, I'll discuss two approaches to learning causally relevant variables. First, I'll introduce causally sufficient text embeddings, a general method that leverages causal model structure to learn causal variables from text data. Next, I'll discuss recent work, inspired by biological tasks, that exploits evolution in the causal mechanism mapping inputs to a target of interest to learn causal variables. Finally, I'll conclude by highlighting ongoing and open research to address the challenges of causal reasoning with ML.
Speaker Bio:
Dhanya Sridhar is an assistant professor in the department of computer science and operations research at Université de Montréal, a core academic member of Mila, and a Canada CIFAR AI Chair. Previously, she was a postdoctoral researcher at Columbia University. She received her doctorate from the University of California, Santa Cruz. Her research is at the intersection of causality and machine learning, using large-scale data to study causal questions in science with the help of ML methods.
This talk is a part of the Applied ML CoR seminar series, and will be followed by the Causal ML reading group meeting. Lunch will be provided at 12.