Causal inference
Causal inference is the study of whether an intervention, exposure, or policy actually changes an outcome. It combines study design, assumptions, statistical methods, and domain knowledge to move beyond association toward answers about cause and effect.
What causal inference is
Causal inference asks what would happen if something were changed: a drug is given, a policy is introduced, a price is raised, a curriculum is changed, or an exposure is removed. That is different from ordinary prediction, which may only ask which variables are associated with an outcome. A causal claim is directional and intervention-based. It says not just that two things move together, but that changing one would change the other under a defined set of conditions.
The counterfactual problem
The central difficulty is that each unit has one potential outcome under each condition, but only one of them is ever observed. A patient either receives a treatment or does not. A city either adopts a policy at a given time or does not. The unobserved alternative is the counterfactual. Causal inference methods try to estimate that missing alternative. The quality of the answer depends on whether the comparison group is a credible stand-in for what would have happened without the intervention.
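The missing-outcome problem can be made concrete with a small simulation. The sketch below is illustrative only: the units, the outcome scale, and the fixed effect of 5 are all assumptions, and the "counterfactual" field exists only because the data are simulated.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical simulated units. The simulation knows both potential outcomes;
# a real dataset would contain only the "treated" and "observed" fields.
units = []
for _ in range(10_000):
    y0 = rng.gauss(50.0, 10.0)    # outcome without the intervention
    y1 = y0 + 5.0                 # outcome with it (effect of 5 is an assumption)
    treated = rng.random() < 0.5  # coin-flip assignment
    units.append({
        "treated": treated,
        "observed": y1 if treated else y0,
        "counterfactual": y0 if treated else y1,  # never available in practice
    })


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Uses the hidden field, so it is computable only inside a simulation.
true_effect = mean(
    (u["observed"] - u["counterfactual"]) if u["treated"]
    else (u["counterfactual"] - u["observed"])
    for u in units
)

# What an analyst can actually compute: the untreated group stands in for
# the treated group's missing outcomes.
estimate = (
    mean(u["observed"] for u in units if u["treated"])
    - mean(u["observed"] for u in units if not u["treated"])
)

print(f"true average effect: {true_effect:.2f}")
print(f"estimate from observed outcomes: {estimate:.2f}")
```

The estimate lands close to the true effect here only because assignment was a coin flip, which makes the untreated group a credible stand-in. Individual effects remain unrecoverable from the observed field alone.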
Randomized experiments
Randomized controlled trials are often the cleanest causal design because random assignment balances both measured and unmeasured factors across treatment groups in expectation. If the trial is well run, differences in outcomes can be attributed more confidently to the intervention. Randomization is not magic, however. Trials can still suffer from noncompliance, attrition, measurement problems, spillovers, small samples, or limited external validity. A trial answers a causal question for a specific setting, population, treatment, and follow-up period, not for every possible future use.
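As a rough illustration of why randomization helps, the following sketch simulates a trial in which an unrecorded baseline factor strongly affects the outcome. The variable names, the effect of 2.0, and the sample size are assumptions chosen for the example, and the interval uses a simple normal approximation.

```python
import math
import random

rng = random.Random(1)


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


arms = {True: {"risk": [], "outcome": []}, False: {"risk": [], "outcome": []}}
for _ in range(4_000):
    risk = rng.gauss(0.0, 1.0)    # baseline factor the trial never records
    treated = rng.random() < 0.5  # random assignment, independent of risk
    outcome = 10.0 + 3.0 * risk + (2.0 if treated else 0.0) + rng.gauss(0.0, 1.0)
    arms[treated]["risk"].append(risk)
    arms[treated]["outcome"].append(outcome)

# Balance check: the unrecorded factor should have similar means in both arms.
risk_gap = mean(arms[True]["risk"]) - mean(arms[False]["risk"])

# Difference in means with a normal-approximation 95% interval.
effect = mean(arms[True]["outcome"]) - mean(arms[False]["outcome"])
se = math.sqrt(
    variance(arms[True]["outcome"]) / len(arms[True]["outcome"])
    + variance(arms[False]["outcome"]) / len(arms[False]["outcome"])
)
ci = (effect - 1.96 * se, effect + 1.96 * se)

print(f"gap in unrecorded risk between arms: {risk_gap:.3f}")
print(f"estimated effect: {effect:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The sketch assumes full compliance and no attrition. Either problem would need separate handling, for example an intention-to-treat comparison.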
Observational studies
Many important questions cannot be randomized for ethical, practical, or political reasons. Observational causal inference uses data from the world as it happened, then tries to construct a fair comparison. Common approaches include matching, inverse-probability weighting, regression adjustment, instrumental variables, difference-in-differences, regression discontinuity, interrupted time series, and synthetic controls. These methods can be powerful, but each rests on assumptions that the data alone cannot fully verify.
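To show how one of these methods builds its comparison, here is a minimal difference-in-differences calculation. The group means are invented for illustration and do not come from any study. The estimate is meaningful only under the parallel-trends assumption, that is, that the treated group would have changed by the same amount as the comparison group had the intervention not happened.

```python
# Hypothetical group means; none of these numbers come from a real study.
means = {
    ("treated", "before"): 20.0,
    ("treated", "after"): 27.0,
    ("comparison", "before"): 15.0,
    ("comparison", "after"): 18.0,
}


def difference_in_differences(m):
    """Change in the treated group minus change in the comparison group."""
    treated_change = m[("treated", "after")] - m[("treated", "before")]
    comparison_change = m[("comparison", "after")] - m[("comparison", "before")]
    return treated_change - comparison_change


did = difference_in_differences(means)
print(did)  # 4.0
```

The treated group rose by 7 and the comparison group by 3, so the method attributes the remaining 4 to the intervention. The comparison group's change serves as the estimate of the counterfactual trend, which is exactly the part the data cannot confirm.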
Confounding and selection
Confounding happens when a variable influences both the treatment and the outcome. For example, people who choose a health behavior may differ from others in income, access to care, age, risk tolerance, or baseline health. If those differences are not accounted for, an estimated effect may partly reflect the confounder rather than the treatment. Selection bias is related but distinct: the people, cases, or times included in a study may not represent the population the question is about. Causal work often begins by asking why units received treatment, why outcomes were measured, and who is missing from the data.
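The effect of a confounder can be seen in a simulation where baseline health drives both who adopts a behavior and how they fare. Everything in this sketch is assumed for illustration: the single binary confounder, the uptake probabilities, and the true effect of 1.0. Stratification is used here as one simple way to adjust; it works only because the confounder is measured.

```python
import random

rng = random.Random(2)

TRUE_EFFECT = 1.0  # assumed effect of the behavior on the outcome score


def mean(xs):
    return sum(xs) / len(xs)


rows = []
for _ in range(20_000):
    healthy = rng.random() < 0.5       # confounder: baseline health
    p_treat = 0.8 if healthy else 0.2  # healthier people opt in more often
    treated = rng.random() < p_treat
    outcome = (
        (5.0 if healthy else 0.0)      # health also raises the outcome directly
        + (TRUE_EFFECT if treated else 0.0)
        + rng.gauss(0.0, 1.0)
    )
    rows.append((healthy, treated, outcome))


def diff_in_means(subset):
    treated_y = [y for _, t, y in subset if t]
    control_y = [y for _, t, y in subset if not t]
    return mean(treated_y) - mean(control_y)


# Naive comparison mixes the behavior's effect with the health difference.
naive = diff_in_means(rows)

# Adjusted comparison: compare within each health level, then weight each
# stratum by its share of the sample.
adjusted = 0.0
for level in (True, False):
    stratum = [r for r in rows if r[0] == level]
    adjusted += diff_in_means(stratum) * len(stratum) / len(rows)

print(f"naive difference: {naive:.2f}")
print(f"stratified estimate: {adjusted:.2f} (assumed true effect: {TRUE_EFFECT})")
```

The naive difference is several times larger than the assumed effect because treated people are disproportionately healthy. If baseline health were unmeasured, the stratified estimate could not be computed at all.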
Causal graphs and assumptions
Directed acyclic graphs, or DAGs, are diagrams that encode assumptions about how variables causally relate to one another. They help researchers decide which variables to adjust for, which paths create confounding, and which variables should not be controlled because they are mediators or colliders. A graph does not make assumptions true. It makes them visible. That visibility is valuable because causal conclusions depend on background knowledge about timing, mechanisms, measurement, and omitted variables.
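The warning about colliders can be illustrated with a simulated example. The assumed graph is talent -> admitted <- effort, with no arrow between talent and effort; the variable names and the admission threshold are hypothetical. Restricting the data to admitted cases is a form of conditioning on the collider.

```python
import random

rng = random.Random(3)


def pearson(xs, ys):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Two causes generated independently of each other.
talent = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
effort = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

# Collider: admission depends on both causes.
admitted = [t + e > 1.0 for t, e in zip(talent, effort)]

overall = pearson(talent, effort)
among_admitted = pearson(
    [t for t, a in zip(talent, admitted) if a],
    [e for e, a in zip(effort, admitted) if a],
)

print(f"correlation in everyone: {overall:.2f}")
print(f"correlation among admitted: {among_admitted:.2f}")
```

In the full sample the two causes are essentially uncorrelated, but among admitted cases a clear negative correlation appears even though neither variable affects the other. Adjusting for such a variable would introduce bias rather than remove it.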
Effect sizes and heterogeneity
A causal estimate is usually about a defined effect: the average treatment effect, the effect among treated units, a local effect for compliers, or an effect for a subgroup. These are not interchangeable. Treatment effects can vary by age, place, baseline risk, implementation quality, and time. A policy may help one group and harm another, while the average looks modest. Good causal analysis asks not only whether something works, but for whom, under what conditions, and compared with what alternative.
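The difference between these estimands can be shown with a simulated population in which a policy helps one group and harms another. The groups, effect sizes, and treatment probabilities below are assumptions for illustration. The sketch reads each person's effect directly, which is possible only in a simulation; with real data each quantity would have to be estimated.

```python
import random

rng = random.Random(4)


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


# Hypothetical population with two groups and opposite-signed effects.
people = []
for _ in range(20_000):
    high_risk = rng.random() < 0.5
    effect = 4.0 if high_risk else -2.0  # helps one group, harms the other
    p_treat = 0.8 if high_risk else 0.2  # high-risk people are treated more often
    treated = rng.random() < p_treat
    people.append((high_risk, treated, effect))

ate = mean(e for _, _, e in people)       # average effect over everyone
att = mean(e for _, t, e in people if t)  # average effect among the treated
effect_high = mean(e for h, _, e in people if h)
effect_low = mean(e for h, _, e in people if not h)

print(f"average treatment effect: {ate:.2f}")
print(f"effect among the treated: {att:.2f}")
print(f"high-risk subgroup: {effect_high:.2f}, low-risk subgroup: {effect_low:.2f}")
```

The population average is modest, the effect among the treated is much larger because treatment is concentrated in the group that benefits, and the two subgroup effects have opposite signs. Reporting any one of these numbers as "the effect" would answer a different question from the others.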
Why it matters
Causal inference matters because decisions are interventions. Doctors, schools, governments, companies, and public-health agencies need to know what actions are likely to change outcomes, not only what variables are predictive. The field also teaches humility. A causal estimate is only as good as its design, assumptions, measurements, and sensitivity checks. Clear causal thinking can prevent confident but misleading claims, especially when big datasets make weak comparisons look precise.