Causal Inference for Data Scientists: Econometrics to Machine Learning


Date
Event
Cakes & Tensors, Booz Allen Hamilton (Data Science Series)
Location
Virtual

Overview

Correlation does not imply causation. Standard machine learning and deep learning methods are powerful for prediction tasks, but on their own they are ill-suited to causal questions such as the impact of a policy or the effect of a drug treatment. In this talk, I give a brief introduction to causal inference with observational data, present standard statistical and econometric methods for estimating causal effects, introduce recent advances that use machine learning for causal inference, and survey the available Python libraries.

The talk is organized into three sections: (i) theory, (ii) models, and (iii) packages.

Theory

  • Randomized Controlled Trials
  • Treatment effects (ATE and CATE)
  • Simpson’s Paradox
  • Potential Outcomes Framework (also known as the Rubin Causal Model)
  • Selection bias
  • Confounding effects
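
To make the theory items concrete, here is a minimal simulation sketch of the potential outcomes framework. It is not taken from the slides: the data-generating process, the binary confounder x, and all numeric values are assumptions chosen for illustration. Each unit has two potential outcomes, Y(0) and Y(1), and the ATE is the mean of Y(1) - Y(0). Because treatment is more likely when x = 1, the naive difference in observed means is biased, while stratifying on x recovers the true effect.

```python
import random
from statistics import mean

# Hypothetical data-generating process (illustration only).
rng = random.Random(0)
rows = []
for _ in range(20_000):
    x = 1 if rng.random() < 0.5 else 0       # binary confounder
    y0 = 1.0 + 2.0 * x + rng.gauss(0, 1)     # potential outcome without treatment
    y1 = y0 + 1.0                            # potential outcome with treatment (true effect = 1.0)
    p_treat = 0.8 if x == 1 else 0.2         # treatment is more likely when x = 1
    t = rng.random() < p_treat
    rows.append((x, t, y0, y1))

# True ATE: only computable here because the simulation exposes both potential outcomes.
true_ate = mean(y1 - y0 for _, _, y0, y1 in rows)

# Naive estimate: difference in observed means, ignoring the confounder.
naive = (mean(y1 for _, t, _, y1 in rows if t)
         - mean(y0 for _, t, y0, _ in rows if not t))

# Adjusted estimate: compare treated and control within each level of x,
# then weight each stratum by its share of the sample.
adjusted = 0.0
for level in (0, 1):
    stratum = [r for r in rows if r[0] == level]
    diff = (mean(r[3] for r in stratum if r[1])
            - mean(r[2] for r in stratum if not r[1]))
    adjusted += len(stratum) / len(rows) * diff

print(f"true ATE: {true_ate:.2f}  naive: {naive:.2f}  adjusted: {adjusted:.2f}")
```

Under these assumptions the naive estimate should come out well above the true effect of 1.0 (roughly 2.2 in expectation), and the gap between the two is the selection bias induced by x.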

Models

Econometric

  • Difference-in-Differences
  • Instrumental Variables (2SLS estimation)
  • Regression Discontinuity Design
  • Propensity Score Matching
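
As one example from this list, the difference-in-differences estimator can be sketched with four group means. This is a minimal illustration on simulated data, not the talk's own code: the group levels, the common time trend, and TRUE_EFFECT are hypothetical values, and the simulation builds in the parallel trends assumption that the method relies on.

```python
import random
from statistics import mean

rng = random.Random(1)
TRUE_EFFECT = 2.0  # hypothetical treatment effect used to generate the data

def outcome(group, post):
    """Simulated outcome for a unit in `group` (1 = treated group) at time `post` (1 = after)."""
    level = 5.0 + 3.0 * group            # the groups differ in baseline level
    trend = 1.5 * post                   # common time trend (parallel trends)
    effect = TRUE_EFFECT * group * post  # the effect applies only to the treated group after treatment
    return level + trend + effect + rng.gauss(0, 1)

# Mean outcome in each of the four (group, period) cells.
cell_means = {
    (g, p): mean(outcome(g, p) for _ in range(5000))
    for g in (0, 1)
    for p in (0, 1)
}

# Difference-in-differences: change in the treated group minus change in the control group.
did = ((cell_means[(1, 1)] - cell_means[(1, 0)])
       - (cell_means[(0, 1)] - cell_means[(0, 0)]))

# A post-period comparison alone mixes the effect with the baseline gap between groups.
post_only = cell_means[(1, 1)] - cell_means[(0, 1)]

print(f"DiD estimate: {did:.2f}  post-period difference: {post_only:.2f}")
```

The same estimate is the coefficient on the group-by-period interaction in a linear regression, which is how it is usually fitted in practice.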

Machine Learning

  • Tree-based models (such as Causal Forests)
  • Meta-learners
  • Econometric extensions
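
A meta-learner wraps ordinary prediction models to estimate the CATE. The sketch below shows a T-learner, one of the simplest variants: fit one outcome model on treated units, another on control units, and take the difference of their predictions. It is an illustration under stated assumptions. The simulated data, the helper fit_line, and the use of a one-feature least-squares line as the base learner are all hypothetical choices, and treatment is randomized here to keep the example short. In practice the base learner would be any regressor.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line through (xs, ys); stands in for any base learner."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

# Hypothetical data: the treatment effect grows with x, so the true CATE is 1 + 2x.
rng = random.Random(2)
data = []
for _ in range(10_000):
    x = rng.uniform(0, 1)
    t = rng.random() < 0.5  # randomized treatment assignment
    y = 0.5 + x + (1.0 + 2.0 * x if t else 0.0) + rng.gauss(0, 0.5)
    data.append((x, t, y))

# T-learner: separate outcome models for the treated and control groups.
mu1 = fit_line([x for x, t, _ in data if t], [y for _, t, y in data if t])
mu0 = fit_line([x for x, t, _ in data if not t], [y for _, t, y in data if not t])

def cate(x):
    """Estimated conditional average treatment effect at covariate value x."""
    return mu1(x) - mu0(x)

# Averaging the CATE over the sample gives an ATE estimate.
ate = mean(cate(x) for x, _, _ in data)

print(f"CATE(0): {cate(0.0):.2f}  CATE(1): {cate(1.0):.2f}  ATE: {ate:.2f}")
```

With this setup the estimated CATE should rise from about 1 at x = 0 to about 3 at x = 1, and the averaged ATE should be close to 2.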

Packages

  • Statsmodels
  • CausalML (Uber)
  • EconML (Microsoft Research)

Please refer to the slides for a more in-depth presentation.

Ancil Crayton
Senior Research Scientist

My research interests lie at the intersection of machine learning, economic analysis, and public policy.