This 25-hour course is the second part of the series on causal inference with linear regression. It is designed for people who want to move beyond simplified examples and understand technically how causal analyses are carried out in practice, where the data, the modelling decisions, and the project setup are often more complex.
The course covers the full workflow of an applied causal analysis. This includes translating a vague business or research question into a well-defined causal design, constructing a causal graph, and mapping that graph to an adjustment strategy that is feasible given the data that are actually available. From there, the course moves to the estimation stage, where the focus is on causal inference using modern linear regression approaches.
More specifically, for estimation we work with linear regression models that include flexible feature engineering, as well as double machine learning methods based on LASSO and OLS. These methods are used to estimate both average and conditional average treatment effects, and are discussed from a conceptual perspective as well as in terms of their practical implementation, including demonstrations using the DoubleML package in Python.
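To make the idea behind double machine learning concrete, here is a minimal sketch of a cross-fitted partialling-out estimator. It is not the DoubleML API and not the course's implementation: the simulated data, the helper names `ols_predict` and `cross_fit_plr`, and the choice of plain OLS for both nuisance models (instead of LASSO) are assumptions made for illustration, and only NumPy is required.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): covariates X confound treatment d and outcome y.
n, p = 2000, 5
X = rng.normal(size=(n, p))
d = X @ np.array([0.8, -0.5, 0.3, 0.0, 0.0]) + rng.normal(size=n)
true_effect = 2.0
y = true_effect * d + X @ np.array([1.0, 1.0, -1.0, 0.5, 0.0]) + rng.normal(size=n)


def ols_predict(X_train, t_train, X_test):
    """Fit OLS with an intercept on the training fold, predict on the test fold."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    beta, *_ = np.linalg.lstsq(A_train, t_train, rcond=None)
    return A_test @ beta


def cross_fit_plr(y, d, X, n_folds=5, seed=0):
    """Partialling-out estimate of the effect of d on y, with cross-fitting."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    y_res = np.empty(len(y))
    d_res = np.empty(len(d))
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Nuisance predictions come from models that never saw the test fold.
        y_res[test] = y[test] - ols_predict(X[train], y[train], X[test])
        d_res[test] = d[test] - ols_predict(X[train], d[train], X[test])
    # Final stage: regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Standard error based on the orthogonal score of the final stage.
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi**2) / np.mean(d_res**2) ** 2 / len(y))
    return theta, se


theta_hat, se_hat = cross_fit_plr(y, d, X)
print(f"estimate: {theta_hat:.3f}, standard error: {se_hat:.3f}")
```

In this simulation the estimate should land close to the simulated effect of 2.0. Swapping the OLS nuisance fits for a regularised learner such as LASSO changes only `ols_predict`; the residualise-then-regress structure stays the same.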
The entire workflow is illustrated through a realistic end-to-end case study. In this case study, we consider an e-commerce setting in which we want to understand how opting into a customer programme affects the profit generated by a customer over the next sixty days. This is not a simplified example. The causal graph becomes relatively large, the adjustment strategy is non-trivial, the data include many covariates, and the modelling choices require care. The aim is to make applied causal analysis tangible in a setting that reflects real-world complexity.
Throughout the course, you will work through coding exercises, so that by the end you will have carried out a complete causal analysis yourself, from study design to estimation.
Rather than focusing on isolated techniques, the course follows a full causal analysis pipeline:
Vague business question → well-defined study design → causal graph → adjustment strategy → positivity diagnostics and handling violations → specification and fitting of the estimation model → interpretation and use of the resulting estimates.
The course focuses on issues that matter in applied work but are often omitted from, or only briefly addressed in, educational material on causal inference. These include, for example, time indexing in causal graphs, the use of aggregated and cluster variables, the appearance of cycles, the limitations of standard LASSO approaches for causal estimation, and the practical implications of positivity violations.
The course is built around a single e-commerce case study that is intentionally not simplified. The causal graph and dataset contain a large number of variables, the ideal adjustment set is not fully observed, and the specification of the estimation model involves uncertainty.
The course makes heavy use of simulations, coding demonstrations and exercises. The goal is to make sure that everything we discuss conceptually also becomes tangible and applicable in practice.
The course does not restrict itself to binary treatments. The methods discussed are also applicable to continuous treatments (and, in some cases, multicategorical treatments).
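As a small illustration of the positivity diagnostics step in the pipeline, the sketch below tabulates the treated share within covariate strata for a binary treatment and flags strata with little or no overlap. The simulated segments, the helper `positivity_table`, and the 10% threshold are hypothetical choices for this example, not recommendations from the course.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: a customer segment (0..3) and a binary opt-in indicator.
n = 4000
segment = rng.integers(0, 4, size=n)
# Opt-in probability by segment; segment 3 never opts in (a structural violation).
p_opt_in = np.array([0.5, 0.3, 0.05, 0.0])
treated = rng.random(n) < p_opt_in[segment]


def positivity_table(stratum, treated, min_share=0.1):
    """Return (stratum, n, treated share, flagged) for each stratum.

    A stratum is flagged when the treated share is below min_share or
    above 1 - min_share. The threshold is an arbitrary illustrative choice.
    """
    rows = []
    for s in np.unique(stratum):
        mask = stratum == s
        share = treated[mask].mean()
        flagged = share < min_share or share > 1 - min_share
        rows.append((int(s), int(mask.sum()), float(share), bool(flagged)))
    return rows


table = positivity_table(segment, treated)
for s, size, share, flagged in table:
    note = "check overlap" if flagged else "ok"
    print(f"segment {s}: n={size}, treated share={share:.2f} -> {note}")
```

With many or continuous covariates, exact strata are no longer practical, and the same question is usually asked of estimated treatment probabilities or densities instead.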
By the end of the course, you will be able to carry out a regression-based causal analysis from start to finish in a structured and technically precise way.
Translate a vague business or research question into a well-defined causal study using target trial emulation
Incorporate practical considerations related to consistency and interference when defining the analysis
Understand the role of time indexing, aggregates, clusters, and cycles in causal graphs
Use these ideas to construct more realistic causal graphs in practice
Use proxy adjustment to map a causal graph to an adjustment strategy that is feasible given the data that are actually available
Understand the practical implications of violations of the positivity assumption at a detailed level
Diagnose positivity problems and reason about coping strategies for binary, multicategorical, and continuous treatments
Understand the limitations of simple OLS models for causal analysis
Use feature engineering to make linear regression models more suitable for real-world causal problems
Understand the role of multicollinearity and high dimensionality in regression-based causal analyses
Learn why standard LASSO is generally inappropriate for causal estimation, and how double machine learning based on LASSO and OLS can help
Learn how to work with the DoubleML package in Python
Implement each step of the causal analysis pipeline (in Python code)
Apply the full workflow to a realistic case study
Interpret causal effect estimates in an assumption- and limitation-aware way
This intermediate-level course is intended for data scientists, analysts, and similar quantitative practitioners who already have some familiarity with the basic concepts of causal inference and want to learn how to apply regression-based causal workflows in practice.
The course is not aimed at complete beginners in causal inference. Some prior exposure to causal concepts is assumed (see prerequisites below).
This course assumes familiarity with the basic concepts of causal inference and linear regression.
In particular, you should be comfortable with:
The definition and interpretation of causal quantities such as the average treatment effect (ATE) and conditional average treatment effect (CATE)
The concept of conditional exchangeability (unconfoundedness)
Causal graphs (DAGs), the backdoor criterion, and DAGitty as a causal graph tool
Conditional expectation functions (CEF) and the role of linear regression as a CEF approximator
The conditions under which simple OLS models that include treatment and control variables can be used for causal estimation
Basic familiarity with Python for data analysis (e.g. working with libraries such as pandas, NumPy, or scikit-learn)
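As a quick self-check on these prerequisites, the following sketch compares a naive difference in means with an OLS model that includes the treatment and a control variable. The simulated data are hypothetical and deliberately favourable: there is a single observed confounder, the conditional expectation function is linear, and the treatment effect is constant, so the OLS coefficient on the treatment can be read as the average treatment effect.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: x confounds the binary treatment d and the outcome y.
n = 5000
x = rng.normal(size=n)
d = (x + rng.normal(size=n) > 0).astype(float)
y = 1.5 * d + 2.0 * x + rng.normal(size=n)  # simulated effect of d is 1.5

# Naive comparison: difference in mean outcomes, ignoring the confounder.
naive = y[d == 1].mean() - y[d == 0].mean()

# OLS with intercept, treatment, and the confounder as a control variable.
A = np.column_stack([np.ones(n), d, x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
adjusted = beta[1]

print(f"naive difference in means: {naive:.2f}")
print(f"OLS coefficient on treatment: {adjusted:.2f}")
```

If it is clear why the naive comparison overstates the effect here while the adjusted coefficient stays close to 1.5, the material in this course should be accessible.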
If you have completed Part I of the course series, you will be well prepared for this course. If not, but you are already comfortable with the concepts listed above, you should also be able to follow the material.
Upon completion of the course, you will receive a certificate of completion from Causal Academy.