In many real-world scenarios, researchers and analysts want to understand the causal impact of an intervention—but random assignment simply isn’t possible. Whether you’re evaluating a marketing campaign, a medical treatment, or a policy intervention, observational data introduces selection bias that can severely distort results.
This is where Propensity Score Matching (PSM) plays a critical role.
First introduced by Rosenbaum and Rubin (1983) in their landmark paper “The Central Role of the Propensity Score in Observational Studies for Causal Effects”, PSM has become a foundational technique in modern causal inference. Today, it is widely used across industries such as healthcare, marketing analytics, economics, public policy, and product experimentation.
This article provides a practical, end-to-end walkthrough of Propensity Score Matching in R, using up-to-date tools and industry-aligned practices—while keeping the explanation intuitive and accessible.
What Is Propensity Score Matching (in Simple Terms)?
Propensity Score Matching is a technique used to reduce selection bias in observational studies.
When treatments are not randomly assigned, treated and untreated groups often differ in systematic ways. These differences—rather than the treatment itself—can drive observed outcomes.
PSM addresses this by:
Estimating each subject’s probability of receiving treatment, given observed characteristics
Matching treated and untreated subjects with similar propensity scores
Comparing outcomes only among these matched subjects
The goal is to approximate a randomized experiment as closely as possible using observational data.
Why PSM Matters: An Intuitive Example
In a controlled lab experiment with rats, researchers can ensure:
Identical genetics
Identical environments
Random treatment assignment
Under these conditions, any observed difference is plausibly caused by the treatment.
With people, however:
Individuals differ by age, income, preferences, and behavior
Participation in treatments (like ads or programs) is often voluntary
Outcomes may reflect pre-existing differences, not the treatment itself
Propensity Score Matching helps control for these observable differences.
A Real-World Use Case: Marketing Campaign Effectiveness
Imagine a marketer wants to evaluate whether an advertising campaign increases product purchases.
Some customers respond to the campaign
Others do not
Responders may already differ (income, age, spending habits)
Without adjustment, a simple comparison would be misleading.
PSM allows us to:
Match responders and non-responders with similar demographics
Estimate the incremental effect of the campaign
Answer a more causal question: What would have happened if responders had not been exposed?
The Dataset
We’ll work with a simulated dataset of 1,000 individuals containing:
Age
Income
Ad_Campaign_Response
1 = Responded
0 = Did not respond
Bought
1 = Purchased
0 = Did not purchase
This structure mirrors many real-world marketing and behavioral datasets.
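The article does not include the data-generating code, but a minimal sketch of how such a dataset could be simulated looks like this (all distributions and coefficients below are illustrative assumptions):

set.seed(123)
n <- 1000

Age    <- round(rnorm(n, mean = 40, sd = 10))
Income <- round(rnorm(n, mean = 50000, sd = 15000))

# Response to the campaign depends on Age and Income (built-in selection bias)
p_response <- plogis(-6 + 0.05 * Age + 0.00005 * Income)
Ad_Campaign_Response <- rbinom(n, 1, p_response)

# Purchase depends on the covariates and on responding to the campaign
p_buy <- plogis(-5 + 0.03 * Age + 0.00004 * Income + 2 * Ad_Campaign_Response)
Bought <- rbinom(n, 1, p_buy)

Data <- data.frame(Age, Income, Ad_Campaign_Response, Bought)
head(Data)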
Baseline Analysis: Naïve Regression
Before matching, we estimate the effect of the campaign using a linear model:
model_1 <- lm(Bought ~ Ad_Campaign_Response + Age + Income, data = Data)
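The estimated coefficient can then be read from the standard model summary:

summary(model_1)  # the Ad_Campaign_Response coefficient is the naive effect estimate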
The coefficient on Ad_Campaign_Response is roughly 0.73, suggesting an increase of about 73 percentage points in purchase probability.
While this estimate is informative, it relies heavily on model assumptions and may still reflect selection bias.
PSM offers a complementary, design-based approach.
Step 1: Estimating Propensity Scores
Propensity scores are estimated using logistic regression, where treatment assignment is modeled as a function of observed covariates:
pscores.model <- glm(
  Ad_Campaign_Response ~ Age + Income,
  family = binomial("logit"),
  data = Data
)
Each individual receives a predicted probability of responding to the campaign—this is their propensity score.
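A minimal sketch of extracting these fitted probabilities (storing them in a column named pscore, which is our own choice of name):

# Predicted probability of responding to the campaign for each individual
Data$pscore <- predict(pscores.model, type = "response")
summary(Data$pscore)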
In modern workflows, these scores are typically used only for matching—not for outcome modeling.
Step 2: Assessing Covariate Balance Before Matching
Before matching, we examine whether treatment and control groups differ systematically.
Using the tableone package:
library(tableone)

table1 <- CreateTableOne(
  vars = c("Age", "Income"),
  strata = "Ad_Campaign_Response",
  data = Data,
  test = FALSE
)
print(table1, smd = TRUE)  # display standardized mean differences
Key metric: Standardized Mean Difference (SMD)
SMD < 0.1 → acceptable balance
SMD > 0.1 → potential confounding
Even when covariates appear balanced, matching can still improve robustness.
Step 3: Matching Algorithms in Practice
Exact Matching
Matches subjects with identical covariate values.
Very strict
Often discards large portions of data
Useful when covariates are categorical and limited
library(MatchIt)

match1 <- matchit(
  Ad_Campaign_Response ~ Age + Income,
  method = "exact",
  data = Data
)
Exact matching often results in smaller samples and reduced statistical power.
Nearest Neighbor Matching (Industry Standard)
The most commonly used approach in applied work.
Matches each treated unit to the closest control unit
Operates on propensity score distance
Balances bias and sample size effectively
match2 <- matchit(
  Ad_Campaign_Response ~ Age + Income,
  method = "nearest",
  ratio = 1,
  data = Data
)
After matching, balance diagnostics typically show:
Dramatically reduced SMDs
Equal sample sizes across groups
Strong overlap in propensity score distributions
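These diagnostics can be checked directly on the fitted matchit object, and the matched sample extracted for the outcome step (matched_data is our own name for it):

summary(match2)                     # covariate balance before vs. after matching
plot(match2, type = "jitter")       # overlap of propensity scores across groups
matched_data <- match.data(match2)  # matched sample used in the next steps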
This approach aligns with current best practices in marketing analytics and health economics.
Step 4: Evaluating Balance After Matching
Re-running CreateTableOne() on the matched data confirms whether balance has improved.
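As a sketch, the same tableone call can simply be re-run on the matched sample from match.data(match2):

table1_matched <- CreateTableOne(
  vars = c("Age", "Income"),
  strata = "Ad_Campaign_Response",
  data = matched_data,
  test = FALSE
)
print(table1_matched, smd = TRUE)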
In our case:
Age and Income SMDs drop close to zero
Treatment and control groups are now comparable
At this point the study design is settled before any outcomes are examined; "design before analysis" is a core principle of modern causal inference.
Step 5: Outcome Analysis on Matched Data
With balanced groups, we test our hypothesis:
Responding to the ad campaign increases the probability of purchase.
We compute pairwise differences and conduct a paired t-test:
t.test(difference)
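The construction of difference is not shown above; one base-R sketch, assuming 1:1 nearest-neighbor matching so that each matched pair shares a subclass id in matched_data:

# Order treated and control rows by matched pair (subclass)
treated <- matched_data[matched_data$Ad_Campaign_Response == 1, ]
control <- matched_data[matched_data$Ad_Campaign_Response == 0, ]
treated <- treated[order(treated$subclass), ]
control <- control[order(control$subclass), ]

# Within-pair difference in the purchase outcome
difference <- treated$Bought - control$Bought

The resulting difference vector then feeds directly into the t.test(difference) call above.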
Results:
Highly statistically significant effect
Estimated treatment effect ≈ 0.73
Interpreted as a 73 percentage-point increase in purchase probability due to campaign exposure
This estimate closely aligns with the regression result—but now rests on a stronger causal foundation.
Key Takeaways
Propensity Score Matching is a design strategy, not just a statistical trick
It is most effective when:
Treatment assignment is non-random
Key confounders are observed
Nearest neighbor matching is the most widely used approach in practice
Balance diagnostics (SMDs, plots) are more important than p-values
PSM complements—not replaces—regression modeling
Final Thoughts
In today’s data-driven industries, causal questions are everywhere—but randomized experiments aren’t always feasible. Propensity Score Matching remains one of the most practical and intuitive tools for bridging that gap.
When used thoughtfully, PSM helps analysts move beyond correlation and closer to credible causal insight—whether you’re measuring campaign ROI, evaluating treatments, or informing strategic decisions.