Estimating Causal Effects

13 June 2023, 11:20 GMT

Structural Causal Model

Structural causal models (SCMs) are a type of model used in causal inference to represent the relationships between variables and how they cause each other.

Unlike a standard ML model, whose objective is predictive accuracy, an SCM is built to recover the correct treatment effect between variables. It provides a framework for testing hypotheses about how a change to one variable will affect other variables in the system. This lets us estimate causal effects from observational data, which matters most when a randomized controlled experiment is impossible or unethical.
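To make the difference between prediction and intervention concrete, here is a minimal sketch of a toy SCM in plain Python. The graph (Z causes X and Y, X causes Y), the coefficients, and the helper names `simulate` and `slope` are all assumptions made for illustration; this is not how causalNet is implemented. The sketch compares a naive regression slope on observational data with the effect obtained by intervening on X directly.

```python
import random

def simulate(n, do_x=None, seed=0):
    """Sample n units from a toy SCM with edges Z -> X, Z -> Y and X -> Y.

    Passing do_x replaces X's structural equation with a constant, which is
    how an SCM represents the intervention do(X = x).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # common cause of X and Y
        if do_x is None:
            x = 1.5 * z + rng.gauss(0, 1)        # observational regime
        else:
            x = do_x                             # interventional regime
        y = 2.0 * x + 3.0 * z + rng.gauss(0, 1)  # assumed true effect of X on Y: 2.0
        rows.append((z, x, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Observational data: the slope of Y on X mixes the causal effect with confounding by Z.
observed = simulate(20000)
naive_slope = slope([x for _, x, _ in observed], [y for _, _, y in observed])

# Interventional data: set X by hand and compare the average outcomes.
y_do1 = mean([y for _, _, y in simulate(20000, do_x=1.0, seed=1)])
y_do0 = mean([y for _, _, y in simulate(20000, do_x=0.0, seed=2)])
interventional_effect = y_do1 - y_do0

print(f"naive regression slope: {naive_slope:.2f}")
print(f"effect under do(X):     {interventional_effect:.2f}")
```

In this simulated setting the naive slope overstates the effect because Z drives both X and Y, while the interventional contrast lands close to the assumed coefficient of 2.0.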

SCMs are widely used in fields such as economics, social sciences, epidemiology, and computer science. Practical applications include understanding the effects of policy interventions, identifying the factors that contribute to disease outbreaks, modeling relationships in a supply chain, and measuring the impact of advertising spend.

Read more on causalNet, our proprietary structural causal model.

Causal Effect Estimation

Causal effect estimation is a fundamental aspect of empirical research, where understanding the impact of interventions or policies is critical. The goal of causal effect estimation is to quantify the effect of a particular treatment or intervention on an outcome of interest while controlling for other factors that might affect the outcome. One approach to estimating causal effects is through randomized controlled trials (RCTs), where participants are randomly assigned to either a treatment or a control group. Randomization helps ensure that any differences observed in outcomes between the two groups are due to the treatment rather than other factors.

In cases where RCTs are not feasible or ethical, instrumental variables (IV) analysis can be used to estimate causal effects. IV analysis employs an instrumental variable that is correlated with the treatment but has no direct effect on the outcome of interest. The variation in the treatment that is induced by the instrument is used to estimate the treatment effect on the outcome, even when some confounders are unobserved. The validity of the instrumental variable depends on the so-called “exclusion restriction” condition, which requires that the instrumental variable affects the outcome solely through its effect on the treatment, and on the instrument itself being independent of the unobserved confounders.
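As a rough illustration of the idea, the sketch below simulates data with an unobserved confounder and compares an ordinary regression slope with the simplest IV estimator, the ratio Cov(Z, Y) / Cov(Z, X). The data-generating process, the coefficients, and the helper `cov` are assumptions chosen for the example, and the instrument is valid here only because the simulation is built that way.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

rng = random.Random(0)
z, x, y = [], [], []
for _ in range(50000):
    u = rng.gauss(0, 1)                        # unobserved confounder (never used for estimation)
    zi = rng.gauss(0, 1)                       # instrument: moves X, has no direct path to Y
    xi = zi + u + rng.gauss(0, 1)              # treatment
    yi = 2.0 * xi + 2.0 * u + rng.gauss(0, 1)  # assumed true effect of X on Y: 2.0
    z.append(zi)
    x.append(xi)
    y.append(yi)

ols_estimate = cov(x, y) / cov(x, x)  # biased: picks up the effect of u
iv_estimate = cov(z, y) / cov(z, x)   # uses only the variation in X that comes from Z

print(f"OLS estimate: {ols_estimate:.2f}")
print(f"IV estimate:  {iv_estimate:.2f}")
```

With real data the exclusion restriction cannot be verified from the data alone, so the choice of instrument has to be defended with domain knowledge.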

Propensity score matching (PSM) is another approach to estimate causal effects, which aims to create a comparison group that is similar to the treatment group by matching participants based on a set of covariates or characteristics. This helps address the issue of selection bias, where participants who receive the treatment may differ systematically from those who do not. PSM can be used to estimate the treatment effect while controlling for other factors that might influence the outcome. However, PSM assumes that the covariates used for matching are sufficient to control for all confounding variables, which may not always be the case.

Finally, the back-door criterion is another approach to estimating causal effects. It involves identifying a set of variables that blocks all “back-door” paths between the treatment and the outcome. The causal effect is then estimated by conditioning on these variables, known as an adjustment set, and marginalizing over them with the adjustment formula. The back-door criterion requires strong assumptions about the causal structure of the data, and its validity depends on the accuracy of these assumptions.
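For a discrete adjustment set, the adjustment formula is P(Y | do(X = x)) = Σ_z P(Y | X = x, Z = z) P(Z = z). The sketch below applies it to a small table of made-up counts, assuming that a single binary variable Z is a valid adjustment set. The counts and function names are hypothetical and chosen so that the naive comparison and the adjusted comparison disagree in sign.

```python
# Hypothetical counts: (z, x) -> (number of units, number of units with Y = 1).
counts = {
    (0, 1): (100, 90),
    (0, 0): (400, 320),
    (1, 1): (400, 200),
    (1, 0): (100, 40),
}
total = sum(n_units for n_units, _ in counts.values())

def p_z(z):
    """Marginal probability P(Z = z)."""
    return sum(n_units for (zz, _), (n_units, _) in counts.items() if zz == z) / total

def p_y_do(x):
    """Back-door adjustment: sum over z of P(Y=1 | X=x, Z=z) * P(Z=z)."""
    return sum(counts[(z, x)][1] / counts[(z, x)][0] * p_z(z) for z in (0, 1))

def p_y_given(x):
    """Plain conditional P(Y=1 | X=x), ignoring Z."""
    n_units = sum(counts[(z, x)][0] for z in (0, 1))
    n_positive = sum(counts[(z, x)][1] for z in (0, 1))
    return n_positive / n_units

adjusted_effect = p_y_do(1) - p_y_do(0)
naive_difference = p_y_given(1) - p_y_given(0)

print(f"adjusted effect:  {adjusted_effect:.2f}")   # 0.10
print(f"naive difference: {naive_difference:.2f}")  # -0.14
```

Within each stratum of Z the treatment raises the outcome rate by 0.10, yet the unadjusted comparison is negative because treated units are concentrated in the stratum with the lower baseline.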

In summary, estimating causal effects is essential to determine the effectiveness of interventions or policies. RCTs, IV analysis, PSM, and the back-door criterion are all useful methods to estimate causal effects, each with its advantages and limitations. Choosing the appropriate method depends on the research question, the available data, and the feasibility of each method.

Synthetic Controls

Synthetic controls are a statistical method used in causal inference to estimate the effect of an intervention or treatment when a randomized control group is not available. In essence, synthetic controls create a counterfactual, which is a combination of units (e.g., individuals, firms, or regions) that did not receive the treatment, to estimate what would have happened to the treated unit in the absence of the treatment. This method is particularly useful in the analysis of aggregate data such as in economics, social sciences, and policy evaluations, where the ‘synthetic control’ acts as a comparison group that mimics the characteristics of the treated unit prior to the intervention.

One of the classic examples of the usage of synthetic controls is in the evaluation of the impact of California’s tobacco control program, where researchers created a synthetic California from a combination of other states to estimate the impact of the program on tobacco consumption. Synthetic controls can also be applied to evaluate the effects of policy changes, natural disasters, and health interventions, among others.

The synthetic control method offers several advantages. Firstly, it allows for the estimation of causal effects in situations where randomization is not feasible, such as in historical or policy analysis. This makes it particularly useful for evaluating the impact of real-world interventions. Secondly, it enables practitioners to create a more comparable control group, as the synthetic control can be constructed to closely match the pre-intervention characteristics of the treated unit. Additionally, the method is transparent and intuitive, as it uses a weighted combination of control units which can be explicitly reported and examined.
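The weighting idea can be sketched in a few lines. The example below uses invented outcome series for one treated unit and three donor units, and it finds non-negative weights that sum to one by a coarse grid search over the pre-intervention fit. Both the data and the grid search are assumptions made to keep the example dependency-free; practical implementations typically solve a constrained optimization problem and also match on covariates.

```python
# Hypothetical outcome series. Pre-intervention periods are used to fit the
# weights, post-intervention periods to estimate the effect.
donors_pre = {
    "A": [10, 12, 14, 16, 18],
    "B": [20, 19, 21, 20, 22],
    "C": [5, 9, 6, 10, 7],
}
donors_post = {"A": [20, 22, 24], "B": [21, 23, 22], "C": [8, 11, 9]}
treated_pre = [14.0, 14.8, 16.8, 17.6, 19.6]
treated_post = [15.4, 17.4, 18.2]
names = ["A", "B", "C"]

def combine(weights, series_by_donor):
    """Weighted combination of the donor series, period by period."""
    n_periods = len(series_by_donor[names[0]])
    return [
        sum(w * series_by_donor[name][t] for w, name in zip(weights, names))
        for t in range(n_periods)
    ]

def pre_period_loss(weights):
    """Squared error between the synthetic and the treated unit before the intervention."""
    synthetic = combine(weights, donors_pre)
    return sum((s - y) ** 2 for s, y in zip(synthetic, treated_pre))

# Enumerate weights on a grid: each weight >= 0 and the three weights sum to 1.
steps = 100
candidates = [
    (i / steps, j / steps, (steps - i - j) / steps)
    for i in range(steps + 1)
    for j in range(steps + 1 - i)
]
best_weights = min(candidates, key=pre_period_loss)

# The synthetic unit's post-period path stands in for the missing counterfactual.
synthetic_post = combine(best_weights, donors_post)
effects = [y - s for y, s in zip(treated_post, synthetic_post)]
average_effect = sum(effects) / len(effects)

print("weights:", dict(zip(names, best_weights)))
print(f"average post-intervention effect: {average_effect:.2f}")
```

Because the weights are explicit, a reader can see which donors the comparison rests on, which is the transparency benefit described above.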

Despite its strengths, the synthetic control method is not without risks. One potential drawback is that the method relies heavily on the assumption that the combination of units used to create the synthetic control can accurately represent the counterfactual scenario. If there are unobserved factors that differ between the synthetic control and the treated unit, the estimates may be biased.

To mitigate some of these risks, decisionOS includes human-in-the-loop causal discovery, which lets practitioners and researchers collaborate on a causal graph that captures the domain knowledge needed to build synthetic controls that represent the counterfactual scenario accurately.

In conclusion, synthetic controls are a powerful tool for researchers and policy analysts who need to estimate the causal impact of an intervention in complex settings. A carefully weighted combination of control units makes it possible to evaluate interventions that a traditional experimental design cannot reach. The method has limitations, but applied with care it supports better evidence-based decision-making and policy evaluation.

Double ML

Double Machine Learning (Double ML), which is closely related to doubly robust estimation, is a methodology in causal inference that aims to estimate treatment effects while addressing concerns of model misspecification. Essentially, Double ML uses machine learning techniques to account for confounding variables that might bias the treatment effect estimates. The term “double” reflects the two models it fits: machine learning is used to predict both the treatment and the outcome from the covariates, and the treatment effect is then estimated from the parts of treatment and outcome that these predictions leave unexplained.
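A minimal sketch of this two-stage idea, in its "partialling-out" form with cross-fitting, is shown below. It is not the decisionOS engine: the simulated data, the true effect of 1.5, and the tiny binned-mean regressor `fit_binned_mean` (a stand-in for a real ML model) are all assumptions made so that the example runs with the standard library alone.

```python
import random

def fit_binned_mean(ws, targets, n_bins=20, lo=-2.0, hi=2.0):
    """Stand-in for an ML regressor: predict the mean target within each bin of w."""
    def bin_of(w):
        idx = int((w - lo) / (hi - lo) * n_bins)
        return min(max(idx, 0), n_bins - 1)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for w, t in zip(ws, targets):
        sums[bin_of(w)] += t
        counts[bin_of(w)] += 1
    overall = sum(targets) / len(targets)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda w: means[bin_of(w)]

def double_ml(ws, ds, ys, n_folds=2):
    """Partialling-out estimate of the effect of d on y, with cross-fitting."""
    numerator = denominator = 0.0
    for fold in range(n_folds):
        train = [i for i in range(len(ws)) if i % n_folds != fold]
        held_out = [i for i in range(len(ws)) if i % n_folds == fold]
        # Two nuisance models, both fitted on the other fold(s).
        treatment_model = fit_binned_mean([ws[i] for i in train], [ds[i] for i in train])
        outcome_model = fit_binned_mean([ws[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            d_resid = ds[i] - treatment_model(ws[i])  # treatment not explained by w
            y_resid = ys[i] - outcome_model(ws[i])    # outcome not explained by w
            numerator += d_resid * y_resid
            denominator += d_resid * d_resid
    return numerator / denominator

# Simulated data (assumption): w confounds d and y through w squared.
rng = random.Random(0)
ws = [rng.uniform(-2, 2) for _ in range(6000)]
ds = [0.5 * w * w + rng.gauss(0, 1) for w in ws]
ys = [1.5 * d + 2.0 * w * w + rng.gauss(0, 1) for w, d in zip(ws, ds)]

mean_d, mean_y = sum(ds) / len(ds), sum(ys) / len(ys)
naive_slope = (
    sum((d - mean_d) * (y - mean_y) for d, y in zip(ds, ys))
    / sum((d - mean_d) ** 2 for d in ds)
)
theta_hat = double_ml(ws, ds, ys)

print(f"naive slope of y on d: {naive_slope:.2f}")
print(f"Double ML estimate:    {theta_hat:.2f}")
```

Cross-fitting means each unit's residuals come from models that never saw that unit, which limits the overfitting bias that flexible learners would otherwise introduce.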

Double ML has been employed in various fields such as healthcare, economics, and social sciences. For instance, in healthcare, Double ML might be used to estimate the causal effect of a new drug on patient recovery rates, while accounting for various confounding factors like age, gender, and pre-existing conditions. In economics, it is often applied to assess the impact of policy changes, such as tax reforms, on economic indicators, taking into consideration other influencing factors. Moreover, in the field of education, researchers might use Double ML to evaluate the effect of different educational interventions on student performance while controlling for background characteristics.

One of the main advantages of Double ML is its robustness to model misspecification. Since it employs machine learning techniques, it can capture complex relationships among variables without relying on strict parametric assumptions. Additionally, Double ML is well suited to high-dimensional settings where the number of covariates is large, because the adjustment for covariates is delegated to learners designed for such data. The methodology also offers flexibility, as various machine learning algorithms can be used in the first step of the process, allowing researchers to choose the ones that best fit their data and objectives.

However, Double ML is not without its challenges. The effectiveness of this method relies on the quality and appropriateness of the machine learning algorithms employed. Incorrectly specified or poorly tuned algorithms can lead to biased estimates. Moreover, while Double ML is robust to model misspecification, it still relies on the assumption of unconfoundedness (i.e., that all confounders have been observed and properly accounted for), and violations of this assumption can undermine the validity of the causal estimates. Additionally, the method can be computationally intensive, especially with large datasets and complex algorithms.

In decisionOS, doubleML works as an engine that can be used natively by causalNet for training, or with any sklearn-compatible ML algorithm. Practitioners can either select the algorithm that suits their data and context, or let an optimization process select the best algorithms for a carefully designed objective function.

Double Machine Learning combines the strengths of machine learning with traditional econometric techniques. Flexible algorithms let it handle high-dimensional covariate spaces and produce more reliable treatment effect estimates. Its assumptions and computational demands need to be kept in mind, but used carefully it enables more rigorous causal analysis and better-informed decisions across many domains.


Meta-Learners

Meta-learners are a class of techniques that use supervised learning methods to estimate causal effects from observational data, in particular the Conditional Average Treatment Effect (CATE).

The main goal of a meta-learner is to estimate the treatment effect as a function of the covariates, allowing for personalized treatment effect estimation. This is important when the effect of the treatment (the intervention or exposure) varies across different subgroups within the population.

For instance, suppose we want to understand the impact of a certain drug on a population. The average effect is valuable, but there may be subgroups for whom the drug works less well, or even leads to negative outcomes. With meta-learners, we can estimate the effect of the treatment on the population as a whole and then dig into each cohort to understand the heterogeneity.

The main types of meta-learners for causal inference include:

  • S-Learner (Single-Learner): This approach involves training a single model on the data, considering the treatment assignment as an additional feature. The effect of the treatment for a unit is then computed as the difference between the model’s predictions for that unit with the treatment feature switched on and switched off.
  • T-Learner (Two-Learner): The T-learner approach uses two separate models, one for the treated group and one for the control group. The difference in the predictions of the two models is taken as the treatment effect.
  • X-Learner: The X-learner first fits the two outcome models of a T-learner. It then imputes a ‘pseudo’ treatment effect for every unit by comparing the unit’s observed outcome with the prediction of the model trained on the opposite group. A second set of models is fitted to these pseudo-outcomes, and their predictions are combined, typically weighted by the propensity score, to give the final treatment effect estimate. This approach can be particularly effective when there is a significant imbalance between the treated and control groups.
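The T-learner and X-learner can be sketched with any base learner. The example below uses a one-variable least-squares line (`fit_line`, a hypothetical helper) as the base learner and simulated data in which the true effect of treatment is 2x, with only about 20% of units treated. All names and numbers are assumptions for illustration. The S-learner is left out because it needs a model over the covariate and the treatment jointly.

```python
import random

def fit_line(xs, ys):
    """Base learner: least-squares line through (xs, ys), returned as a function."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

# Simulated data (assumption): the treatment effect grows with x, CATE(x) = 2x.
rng = random.Random(0)
data = []
for _ in range(4000):
    x = rng.uniform(0, 1)
    t = 1 if rng.random() < 0.2 else 0
    y = 1.0 + x + t * 2.0 * x + rng.gauss(0, 0.5)
    data.append((x, t, y))
treated = [(x, y) for x, t, y in data if t == 1]
control = [(x, y) for x, t, y in data if t == 0]

# T-learner: one outcome model per group; the CATE is the difference in predictions.
mu1 = fit_line([x for x, _ in treated], [y for _, y in treated])
mu0 = fit_line([x for x, _ in control], [y for _, y in control])

def t_learner(x):
    return mu1(x) - mu0(x)

# X-learner: impute unit-level effects using the opposite group's model,
# fit a model to each set of imputed effects, then blend the two.
imputed_treated = [y - mu0(x) for x, y in treated]
imputed_control = [mu1(x) - y for x, y in control]
tau1 = fit_line([x for x, _ in treated], imputed_treated)
tau0 = fit_line([x for x, _ in control], imputed_control)
share_treated = len(treated) / len(data)  # constant propensity in this simulation

def x_learner(x):
    return share_treated * tau0(x) + (1 - share_treated) * tau1(x)

for x in (0.1, 0.5, 0.9):
    print(f"x={x}: T-learner {t_learner(x):.2f}, X-learner {x_learner(x):.2f}, true {2 * x:.2f}")
```

In practice the base learner would be a flexible model such as gradient boosting, and the propensity used for blending would itself be estimated from the covariates.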

These meta-learners provide a powerful toolset for estimating heterogeneous treatment effects, which can be highly valuable in domains like healthcare, economics, and social sciences where understanding the variable impact of interventions on different groups is crucial.

Propensity Score Matching (PSM)

Propensity Score Matching (PSM) is a widely used statistical technique in causal inference, aimed at estimating the effect of a treatment or intervention by accounting for the covariates that predict receiving the treatment. In observational studies, treatment assignment is often not random, so a direct comparison of treated and untreated units can give biased estimates of treatment effects. Propensity score matching seeks to mitigate this bias by pairing individuals who received the treatment with similar individuals who did not, based on their propensity scores. The propensity score is the probability of receiving the treatment given a set of observed characteristics.
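The procedure can be sketched end to end on simulated data. The example below is an illustration under stated assumptions, not the decisionOS implementation: it uses a single covariate, a hand-written logistic regression (`fit_propensity`) for the propensity score, one-to-one nearest-neighbour matching with replacement, and a caliper of 0.05. The data-generating process and the true effect of 2.0 are invented.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fit_propensity(xs, ts, learning_rate=1.0, steps=300):
    """Logistic regression of treatment on one covariate, by gradient ascent."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            error = t - sigmoid(a + b * x)
            grad_a += error
            grad_b += error * x
        a += learning_rate * grad_a / n
        b += learning_rate * grad_b / n
    return lambda x: sigmoid(a + b * x)

# Simulated data (assumption): x raises both the chance of treatment and the outcome.
rng = random.Random(0)
units = []
for _ in range(2000):
    x = rng.gauss(0, 1)
    t = 1 if rng.random() < sigmoid(x) else 0
    y = 2.0 * t + 3.0 * x + rng.gauss(0, 1)  # assumed true treatment effect: 2.0
    units.append((x, t, y))

propensity = fit_propensity([u[0] for u in units], [u[1] for u in units])
treated = [(propensity(x), y) for x, t, y in units if t == 1]
control = [(propensity(x), y) for x, t, y in units if t == 0]

# Match each treated unit to the control unit with the closest propensity score.
# Treated units with no control inside the caliper are dropped.
caliper = 0.05
matched_differences = []
for score, y in treated:
    match_score, match_y = min(control, key=lambda c: abs(c[0] - score))
    if abs(match_score - score) <= caliper:
        matched_differences.append(y - match_y)

att_estimate = sum(matched_differences) / len(matched_differences)
naive_difference = (
    sum(y for _, y in treated) / len(treated)
    - sum(y for _, y in control) / len(control)
)

print(f"naive difference in means: {naive_difference:.2f}")
print(f"matched estimate (ATT):    {att_estimate:.2f}")
print(f"treated units matched:     {len(matched_differences)} of {len(treated)}")
```

The last line of output shows how many treated units survive the caliper, which is a quick check on how much data the matching step discards.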

Propensity Score Matching is utilized in various fields including medicine, economics, education, and social sciences. In medicine, PSM might be used to estimate the effect of a drug by matching treated patients with control patients who have similar health characteristics. In economics, researchers might use PSM to evaluate the impact of training programs on employment, by matching individuals who participated in the program with those who did not but had similar observable characteristics. Similarly, in education, PSM can be employed to assess the effectiveness of educational interventions by matching students who received an intervention with similar students who did not.

The primary advantage of Propensity Score Matching is its ability to reduce bias due to confounding variables, allowing for more credible estimates of causal effects in observational data. By matching on the propensity score, one can ensure that the distribution of observed covariates is similar between the treated and control groups, which is akin to achieving balance in a randomized experiment. Moreover, PSM is particularly useful when dealing with a large number of covariates, as it condenses them into a single score, making the matching process more manageable.

However, there are limitations and potential risks associated with Propensity Score Matching. Firstly, PSM can only control for observed confounders, and if there are unobserved confounders, the estimates may still be biased. Secondly, the quality of matches can be sensitive to the choice of the algorithm and caliper width, and poor matches can lead to unreliable estimates. Additionally, PSM often results in a loss of data, as not all treated units may find a suitable match, which can reduce the efficiency and generalizability of the estimates.

In decisionOS, PSM is implemented as part of the causal effect estimation package, which contains the main classical methods for causal inference. In more complex scenarios or high-dimensional datasets, PSM can also be enhanced with more advanced methods for pairing individuals that go beyond a simple propensity estimate.

In conclusion, Propensity Score Matching is a valuable tool for causal analysis, particularly in settings where randomized control is not feasible. By approximating a randomized experiment through careful matching, PSM lets researchers and policymakers extract meaningful insights from observational data. Its limitations and the importance of careful implementation must be kept in mind, but used well it supports robust causal analysis and improved decision-making across a variety of fields.
