On counterfactual inference with unobserved confounding

Speaker

Abhin Shah

CSAIL and LIDS

Host

Sharut Gupta

MIT CSAIL

Abstract: Given an observational study with n independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit using only one p-dimensional sample per unit containing covariates, interventions, and outcomes. Specifically, we allow for unobserved confounding that both introduces statistical biases between interventions and outcomes and exacerbates the heterogeneity across units. Modeling the conditional distribution of the outcomes as an exponential family, we reduce learning the unit-level counterfactual distributions to learning n exponential family distributions with heterogeneous parameters and only one sample per distribution. We introduce a convex objective that pools all n samples to jointly learn all n parameter vectors, and provide a unit-wise mean squared error bound that scales linearly with the metric entropy of the parameter space. For example, when the parameters are an s-sparse linear combination of k known vectors, the error is O(s log k / p). En route, we derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality. As an application of the framework, our results enable consistent imputation of sparsely missing unobserved confounders.

Speaker bio: Abhin Shah is a sixth-year Ph.D. student advised by Prof. Devavrat Shah and Prof. Greg Wornell. He is a recipient of MIT’s Jacobs Presidential Fellowship. His research interests include theoretical and applied aspects of trustworthy machine learning with a focus on causality and fairness.

Date

October 06 2023, 16:00 to 16:30 (America/New_York)

Location

Room 32-370

Organizer & Contact

Sharut Gupta

sharut@mit.edu

Part of

ML Tea

Related Events

April 28

ML Tea: Evaluating Multiple Models Using Labeled and Unlabeled Data

April 23

ML Tea: Do Large Language Model Benchmarks Test Reliability?