Difference-in-Differences vs. Traditional Event Studies: When to Use Which
Introduction
Researchers in finance and economics frequently need to estimate the causal effect of an event or policy change on an outcome of interest. Two of the most widely used approaches for this task are the traditional event study methodology and the difference-in-differences (DiD) framework. While both methods aim to isolate causal effects, they differ fundamentally in their assumptions, data requirements, and domains of application. Understanding when to use each approach -- and when they can be combined -- is essential for producing credible empirical research.
This guide provides a detailed comparison of the two methods, explains their respective strengths and limitations, surveys modern developments in DiD estimation, and offers practical guidance for implementation.
What Is a Traditional Event Study?
A traditional event study, as developed in the finance literature, measures the impact of a specific event on the market value of a firm by analyzing abnormal stock returns. The methodology was formalized by Fama, Fisher, Jensen, and Roll (1969) and has since become one of the most widely used empirical tools in financial economics.
The approach works as follows. During an estimation window prior to the event, a model of expected returns (typically the market model) is estimated for each firm. During the event window, the abnormal return is computed as the difference between the firm's actual return and the return predicted by the model. These abnormal returns are then aggregated across firms and tested for statistical significance.
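The mechanics above can be sketched numerically. The following Python snippet uses made-up returns and the market model only: it estimates alpha and beta by OLS over a small estimation window, then computes a one-day abnormal return in the event window.

```python
import numpy as np

# Hypothetical example: market-model event study for one firm.
# Estimation window: firm returns follow r = alpha + beta * r_mkt exactly,
# so OLS recovers alpha = 0.001 and beta = 1.2 without error.
alpha_true, beta_true = 0.001, 1.2
r_mkt_est = np.array([0.01, -0.02, 0.015, 0.005, -0.01])
r_firm_est = alpha_true + beta_true * r_mkt_est

# Estimate the market model by OLS over the estimation window.
X = np.column_stack([np.ones_like(r_mkt_est), r_mkt_est])
alpha_hat, beta_hat = np.linalg.lstsq(X, r_firm_est, rcond=None)[0]

# Event window: abnormal return = actual return minus model prediction.
r_mkt_event = 0.01
r_firm_event = 0.043                           # actual return on the event day
expected = alpha_hat + beta_hat * r_mkt_event  # 0.001 + 1.2 * 0.01 = 0.013
abnormal = r_firm_event - expected             # 0.043 - 0.013 = 0.030
print(round(abnormal, 3))  # → 0.03
```

In practice the abnormal returns are averaged across firms and event days, and their cumulative sum is tested against zero with an appropriate test statistic.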
Traditional event studies are particularly well-suited to settings where: the event is well-defined and precisely dated; the outcome of interest is a financial market return; the event affects a clearly identified set of firms; and the efficient market hypothesis provides a credible basis for the counterfactual (i.e., what the return would have been without the event).
What Is Difference-in-Differences?
Difference-in-differences is an econometric technique that estimates the causal effect of a treatment or policy by comparing the change in outcomes over time between a treatment group (units affected by the event) and a control group (units not affected). The "first difference" removes time-invariant confounders within each group, and the "second difference" accounts for common time trends that affect both groups equally.
In its simplest form, the DiD estimator compares the average change in the outcome variable for treated units before and after the treatment with the corresponding change for control units. The identifying assumption is the parallel trends assumption: in the absence of treatment, the treatment and control groups would have followed the same trajectory over time. This assumption cannot be tested directly, but its plausibility can be assessed by examining pre-treatment trends.
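In the two-group, two-period case the estimator reduces to simple arithmetic. A minimal Python sketch with made-up group means:

```python
# Hypothetical 2x2 example with made-up group means.
# Treated group: mean outcome rises from 10 (pre) to 16 (post).
# Control group: mean outcome rises from 8 (pre) to 11 (post).
y_treat_pre, y_treat_post = 10.0, 16.0
y_ctrl_pre, y_ctrl_post = 8.0, 11.0

first_diff_treat = y_treat_post - y_treat_pre  # 6: treatment effect + common trend
first_diff_ctrl = y_ctrl_post - y_ctrl_pre     # 3: common trend only, under parallel trends
did = first_diff_treat - first_diff_ctrl       # 3: estimated treatment effect
print(did)  # → 3.0
```

Subtracting the control group's change strips out the common trend, so what remains is attributed to the treatment, provided parallel trends holds.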
DiD is widely used in labor economics, public policy evaluation, health economics, and increasingly in corporate finance. It is particularly valuable when the treatment is not randomly assigned and when the researcher needs to account for both time-invariant unobserved heterogeneity and common time shocks.
Key Differences Between the Two Approaches
Although both methods seek to estimate causal effects, they differ in several fundamental ways. Understanding these differences is critical for choosing the right approach for a given research question.
Counterfactual Construction
In a traditional event study, the counterfactual is constructed from a statistical model of expected returns estimated over a pre-event window. The market model, for instance, uses the firm's historical relationship with the market index to predict what its return would have been in the absence of the event. This approach relies on the validity of the expected return model and the assumption that the model parameters are stable over time.
In DiD, the counterfactual is constructed from the observed behavior of a control group. Rather than relying on a parametric model, DiD uses the change in the control group's outcome as a proxy for what would have happened to the treatment group in the absence of treatment. This makes DiD less dependent on model specification but more dependent on the quality of the control group.
Outcome Variables
Traditional event studies typically focus on stock returns as the outcome variable. The efficient market hypothesis provides a strong theoretical justification for using returns, since prices should rapidly incorporate new information. This focus on financial market outcomes limits the method's applicability to settings where stock price data are available and meaningful.
DiD, by contrast, can be applied to virtually any outcome variable -- employment, wages, firm profitability, investment, emissions, health outcomes, and many others. This flexibility makes DiD the preferred choice when the research question concerns real economic outcomes rather than market valuations, or when the affected entities are not publicly traded.
Timing and Identification
Traditional event studies require precise knowledge of the event date for each firm. The power of the method derives from the sharp change in the information environment at the announcement date. If the event date is imprecise or if information leaks before the announcement, the method loses power.
DiD is more flexible regarding timing. It can accommodate events that occur over extended periods, staggered treatment adoption across units, and gradual policy implementation. This flexibility is particularly valuable for studying regulatory changes, where different jurisdictions may adopt a policy at different times.
Control Group Requirements
Traditional event studies do not require an explicit control group. The expected return model serves as the counterfactual, and the abnormal return itself is the treatment effect estimate. Cross-sectional aggregation across multiple treated firms improves statistical power.
DiD requires an explicit control group that is credibly comparable to the treatment group. The parallel trends assumption demands that, absent treatment, the two groups would have evolved similarly over time. Identifying an appropriate control group is often the most challenging aspect of a DiD design.
When to Use Difference-in-Differences
DiD is the preferred approach in several important settings where traditional event studies are impractical or inappropriate.
- Staggered treatment adoption: When a policy or treatment is adopted by different units at different times, DiD with staggered adoption is the natural framework. For example, studying the effect of a regulation that is adopted by different US states in different years requires a staggered DiD design. Traditional event studies can also accommodate staggered event dates, but DiD handles them more naturally, especially when the outcomes are not financial market returns.

- Non-market outcomes: When the outcome of interest is not a stock return -- for instance, employment, investment, pollution, or firm-level accounting measures -- DiD is the appropriate method. Traditional event studies are designed for financial market data and do not easily extend to other outcome variables.
- No clear event date: When the treatment does not have a sharp, well-defined event date, traditional event studies lose their power. DiD can accommodate gradual treatment adoption, pre-announcement anticipation, and extended implementation periods.
- Panel data availability: DiD naturally exploits panel data structure with repeated observations on the same units over time. When rich panel data are available, DiD can leverage the within-unit variation to control for unobserved heterogeneity more effectively.
- Need for a control group: Some research designs explicitly require comparing treated and untreated units. For example, evaluating the effect of a tax change on affected firms requires comparing them with firms not affected by the change. DiD formalizes this comparison in a rigorous framework.
Modern DiD Estimators
DiD estimation has undergone a methodological revolution in recent years. Researchers have identified serious problems with the traditional two-way fixed effects (TWFE) DiD estimator in settings with staggered treatment adoption and heterogeneous treatment effects. Several new estimators have been developed to address these issues.
The Problem with Two-Way Fixed Effects
The conventional approach to DiD with staggered adoption involves a TWFE regression with unit and time fixed effects plus a treatment indicator. Goodman-Bacon (2021) showed that the TWFE estimator is a weighted average of all possible two-group, two-period DiD estimates, and crucially, some of these weights can be negative. This means that the TWFE estimator can produce biased results -- and even the wrong sign -- when treatment effects vary over time or across groups.
The problem arises because TWFE implicitly uses already-treated units as controls for newly-treated units. If treatment effects evolve over time (e.g., they grow or fade), this comparison produces biased estimates. The discovery of this problem prompted a wave of methodological innovation.
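The bias can be seen with stylized numbers. In the sketch below (hypothetical data, no underlying trends), an early-treated group whose effect is still growing serves as the implicit control for a late-treated group, and the resulting 2x2 DiD estimate is zero even though the true effect is one:

```python
# Stylized numbers illustrating the already-treated-control problem.
# Two groups, no underlying trends (the untreated outcome is 0 throughout).
# Early group: treated at t=2, with an effect that GROWS: 1, 2, 3 at
# relative times 0, 1, 2. Late group: treated at t=4, true effect = 1.
y_early = [0, 1, 2, 3]  # periods 1-4; effect grows each period after t=2
y_late = [0, 0, 0, 1]   # untreated until t=4, then effect = 1

# The 2x2 comparison TWFE implicitly makes around t=4: the late group is
# "treated" and the already-treated early group serves as the "control".
diff_late = y_late[3] - y_late[2]     # 1: the true effect
diff_early = y_early[3] - y_early[2]  # 1: growth of the early group's effect
did_bad = diff_late - diff_early      # 0: the true effect of 1 is wiped out
print(did_bad)  # → 0
```

The early group's still-evolving treatment effect is differenced out as if it were a common trend, which is exactly the mechanism behind the negative weights.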
Callaway and Sant'Anna (2021)
Callaway and Sant'Anna proposed a method that estimates group-time average treatment effects -- the effect for each cohort (defined by treatment timing) at each time period. These disaggregated estimates avoid the negative weighting problem by never using already-treated units as controls. The group-time estimates can then be aggregated into summary measures using appropriate weighting schemes. This method is implemented in the R package did.
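The group-time logic can be illustrated by hand. The sketch below uses made-up deterministic data (the R package's att_gt() is the actual implementation): ATT(g, t) for one cohort is the cohort's change since the last pre-treatment period, minus the corresponding change for never-treated units.

```python
# Hypothetical deterministic example for one cohort (treated at t=3).
# Outcomes: a common time trend 0, 1, 2, 3 plus a treatment effect of 2
# for the treated cohort from t=3 onward; never-treated follow the trend.
y_cohort = {1: 5, 2: 6, 3: 9, 4: 10}  # 5 + trend, + effect of 2 from t=3
y_never = {1: 1, 2: 2, 3: 3, 4: 4}    # 1 + trend, never treated

g = 3  # treatment period of this cohort; baseline period is g - 1 = 2

def att_gt(t):
    # Change for the cohort minus change for the never-treated,
    # both measured relative to the last pre-treatment period g - 1.
    return (y_cohort[t] - y_cohort[g - 1]) - (y_never[t] - y_never[g - 1])

print(att_gt(3), att_gt(4))  # → 2 2
```

Because each ATT(g, t) compares a cohort only with units not yet treated at t, already-treated units never enter as controls.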
Sun and Abraham (2021)
Sun and Abraham developed an interaction-weighted estimator that modifies the standard TWFE specification to avoid contamination from heterogeneous treatment effects. Their approach interacts cohort indicators with relative time indicators and uses only never-treated or last-treated units as controls. The resulting estimator is robust to treatment effect heterogeneity and can be implemented using the fixest package in R with its sunab() function.
de Chaisemartin and D'Haultfoeuille (2020)
de Chaisemartin and D'Haultfoeuille propose estimators that are valid under minimal assumptions, even when treatment effects are heterogeneous across groups and over time. Their did_multiplegt command (available in both R and Stata) provides robust DiD estimates along with tests for pre-trends and treatment effect dynamics.
Borusyak, Jaravel, and Spiess (2024)
This method takes an imputation approach to DiD estimation. It first estimates the counterfactual outcomes for treated observations using only untreated observations, then computes treatment effects as the difference between observed and imputed outcomes. This approach is computationally efficient and handles complex treatment patterns naturally.
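A stripped-down version of the imputation idea, with made-up, exactly additive data and a single never-treated control unit (the actual estimator fits unit and time fixed effects jointly on all untreated observations):

```python
# Minimal imputation sketch with exactly additive made-up data, so the
# two-way fixed effects model fits the untreated observations perfectly.
# One control unit (never treated) and one treated unit (treated at t=3).
y_ctrl = {1: 1.0, 2: 2.0, 3: 3.0}   # alpha_ctrl = 1 plus time effects 0, 1, 2
y_treat = {1: 2.0, 2: 3.0, 3: 6.0}  # alpha_treat = 2; treated outcome at t=3

# Step 1: estimate time effects from untreated observations only
# (here from the control unit, normalizing the t=1 time effect to zero).
time_effect = {t: y_ctrl[t] - y_ctrl[1] for t in y_ctrl}

# Step 2: estimate the treated unit's fixed effect from its pre-periods.
alpha_treat = y_treat[1] - time_effect[1]  # = 2.0

# Step 3: impute the untreated counterfactual and take the difference.
y_imputed = alpha_treat + time_effect[3]   # 2 + 2 = 4
effect = y_treat[3] - y_imputed            # 6 - 4 = 2
print(effect)  # → 2.0
```

Because the fixed effects are estimated without any treated observations, treatment effect heterogeneity cannot contaminate the counterfactual.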
The Event Study Plot in DiD
A concept that bridges both methodologies is the "event study plot" commonly used in DiD analyses. Despite sharing the name "event study," this visualization technique differs from the traditional finance event study. In the DiD context, the event study plot displays estimated treatment effects at each relative time period (leads and lags relative to the treatment date).
The pre-treatment coefficients serve as a visual test of the parallel trends assumption: if these coefficients are close to zero and statistically insignificant, it supports the assumption that treatment and control groups were on parallel trajectories before the intervention. The post-treatment coefficients trace out the dynamic treatment effect, showing how the effect evolves over time after treatment.
This dynamic specification is estimated by including leads and lags of the treatment indicator in the regression, with one period (typically the period immediately before treatment) omitted as the reference. The resulting coefficient estimates, plotted with confidence intervals, provide a comprehensive picture of the treatment effect dynamics.
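In the two-group case these lead and lag coefficients reduce to period-by-period DiD contrasts against the omitted reference period. A minimal Python sketch with made-up numbers:

```python
# Hand-computed event-study coefficients for a single treated group and
# a single control group (made-up numbers), with treatment at t=4 and
# the period immediately before (t=3, relative time -1) as the reference.
y_treat = {1: 4.0, 2: 5.0, 3: 6.0, 4: 9.0, 5: 11.0}
y_ctrl = {1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0}
g, ref = 4, 3  # treatment period and omitted reference period

# Coefficient at relative time k = t - g: DiD of each period against the
# reference period; pre-treatment values near zero support parallel trends.
coefs = {t - g: (y_treat[t] - y_treat[ref]) - (y_ctrl[t] - y_ctrl[ref])
         for t in y_treat if t != ref}
print(coefs)  # → {-3: 0.0, -2: 0.0, 0: 2.0, 1: 3.0}
```

Here the pre-treatment coefficients are exactly zero (the groups trend in parallel before t=4), and the post-treatment coefficients trace out an effect that grows from 2 to 3.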
Implementation in R
For researchers working in R, several packages support modern DiD estimation with event study visualizations.
The did package by Callaway and Sant'Anna provides a comprehensive framework for estimating group-time average treatment effects with staggered adoption. The main function att_gt() computes the disaggregated estimates, and aggte() aggregates them into summary measures. The package includes built-in plotting functions for event study visualizations.
The fixest package by Berge supports the Sun and Abraham estimator through its sunab() function, which can be used within the standard feols() regression framework. This integration makes it easy to estimate robust DiD specifications alongside other fixed effects models.
For traditional finance-style event studies, the EventStudy R package provides a complete workflow for computing abnormal returns, cumulative abnormal returns, and a comprehensive set of parametric and non-parametric test statistics. When your research question focuses on stock market reactions and you have precise event dates, this is the natural tool.
In practice, many modern research projects use both approaches as complementary analyses. A traditional event study can capture the immediate market reaction to an announcement, while a DiD analysis can examine the longer-term effects on real outcomes such as investment, employment, or profitability.
Common Mistakes to Avoid
Researchers applying either methodology should be aware of several common errors that can undermine the credibility of their findings.
- Using TWFE with heterogeneous effects: Applying the standard two-way fixed effects estimator in a staggered DiD setting without testing for treatment effect heterogeneity is one of the most common mistakes in applied research. Always consider using one of the modern DiD estimators (Callaway and Sant'Anna, Sun and Abraham, or Borusyak et al.) and compare results with the standard TWFE specification.
- Ignoring pre-trends: In DiD analyses, failing to test and report pre-treatment trends undermines the credibility of the parallel trends assumption. Always include an event study plot showing pre-treatment coefficients. Be cautious about "pre-testing bias" -- the practice of selectively reporting specifications that pass the pre-trends test.
- Wrong standard errors: In DiD settings with panel data, standard errors must account for serial correlation within units and potential cross-sectional dependence. Cluster standard errors at the unit level (or at the level of treatment assignment) as a minimum. In traditional event studies, use robust test statistics (BMP, Kolari-Pynnonen) that account for event-induced variance and cross-sectional correlation.
- Confusing the two types of "event study": The term "event study" is used in both the finance literature (referring to abnormal return analysis) and the DiD literature (referring to dynamic treatment effect estimation). These are distinct methods with different assumptions and implementations. Be precise about which method you are using and why.
- Inappropriate control group selection: In DiD, the credibility of the estimate depends entirely on the quality of the control group. Using never-treated units as controls is generally preferred over using not-yet-treated units, as the latter can introduce bias when treatment effects are dynamic. Document your control group selection criteria and justify why the parallel trends assumption is plausible.
- Neglecting anticipation effects: If firms or individuals anticipate the treatment and change their behavior before the official treatment date, both traditional event studies and DiD estimates can be biased. In DiD, allow for anticipation by setting the treatment date earlier or by explicitly modeling anticipation effects. In traditional event studies, use wider pre-event windows to capture anticipation.
- Over-interpreting long-window results: Long-window event studies (whether in the finance or DiD tradition) face well-known power and specification problems. As the window length increases, the risk of contamination from other events grows, and the identifying assumptions become less plausible. Report results for multiple window lengths and be transparent about the limitations of longer horizons.
Conclusion
Traditional event studies and difference-in-differences are complementary methods, each with distinct strengths. Traditional event studies excel at measuring the immediate market reaction to well-defined corporate or economic events, leveraging the informational efficiency of financial markets to produce precise, short-window estimates of value effects. DiD excels at estimating causal effects on a broad range of outcomes, accommodating staggered treatment adoption, and providing a transparent framework for causal inference in panel data settings.
The choice between the two methods depends on the research question, the available data, and the nature of the treatment. In many cases, using both approaches as complementary analyses strengthens the overall evidence. The recent revolution in DiD methodology -- particularly the development of estimators robust to heterogeneous treatment effects -- has expanded the toolkit available to researchers and made it more important than ever to choose methods carefully and apply them correctly.
Implement This with the R Package
Access advanced features and full customization through the EventStudy R package.