Panel Event Studies (Difference-in-Differences)
A panel event study combines event study methodology with Difference-in-Differences estimation to measure causal effects when a policy affects multiple units at different times. It uses untreated units as controls, with estimators like Sun & Abraham and Callaway & Sant'Anna for staggered designs.
Part of the Event Study Methodology Guide.
Panel event studies combine the event study framework with Difference-in-Differences (DiD) estimation. Use this approach when a policy or regulation affects multiple units at different times and you have a control group for comparison.
Panel event studies scale to designs with 100 or more treatment cohorts entering at different times. According to Bertrand, Duflo, and Mullainathan (2004), ignoring serial correlation in panel settings leads to rejection of a true null hypothesis in roughly 45% of placebo tests at the 5% level. Cluster-robust standard errors and wild bootstrap inference are therefore essential for valid panel event study results.
When Should I Use Panel vs. Cross-Sectional Event Studies?
A panel event study is a causal inference framework that combines event study methodology with Difference-in-Differences (DiD) estimation to measure the effect of a policy or intervention across multiple units treated at potentially different times. Unlike cross-sectional event studies that rely on expected return models, panel event studies use untreated units as an explicit control group and identify effects through the parallel trends assumption. Since 2021, at least 6 heterogeneity-robust DiD estimators have been developed to correct the bias inherent in traditional Two-Way Fixed Effects regression under staggered treatment timing.
| Feature | Cross-Sectional Event Study | Panel Event Study (DiD) |
|---|---|---|
| Treatment timing | All firms on the same date | Firms treated at different dates |
| Control group | Market index | Untreated firms |
| Identification | Expected return model | Parallel trends assumption |
| Best for | Market reactions to announcements | Causal effect of policies/regulations |
How Do the Panel DiD Estimators Compare?
| Estimator | R Method | Staggered Treatment | Heterogeneous Effects | Output | Required Package |
|---|---|---|---|---|---|
| Static TWFE | "static_twfe" | Biased | Biased | Single ATT estimate | base R |
| Dynamic TWFE | "dynamic_twfe" | Biased | Biased | Event-time coefficients | base R |
| Sun & Abraham | "sun_abraham" | Unbiased | Unbiased | Event-time coefficients | base R |
| Callaway & Sant'Anna | "callaway_santanna" | Unbiased | Unbiased | Group-time ATTs | did |
| de Chaisemartin & D'Haultfoeuille | "dechaisemartin_dhaultfoeuille" | Unbiased | Unbiased | Switchers ATT | DIDmultiplegt |
| Borusyak, Jaravel & Spiess | "borusyak_jaravel_spiess" | Unbiased | Unbiased | Imputed event-time coefficients | didimputation |
Default recommendation
Use Sun & Abraham when treatment is staggered across units and you want event-time dynamics with no extra dependencies. Use Callaway & Sant'Anna for flexible group-time ATTs with built-in aggregation. Use Dynamic TWFE only when all units are treated at the same time.
What Data Format Does Panel DiD Require?
Panel data must be in long format with one row per unit-period:
# Required columns (names are configurable)
library(tibble)
panel_data <- tibble(
unit_id = ..., # Firm/unit identifier
time_id = ..., # Time period (integer)
outcome = ..., # Outcome variable (e.g., stock return, revenue)
treated = ..., # Treatment indicator (0/1)
treatment_time = ... # Period when unit first treated (NA for controls)
)

Static TWFE
The Two-Way Fixed Effects estimator, the most commonly used panel DiD specification since the 1990s, includes unit and time fixed effects to estimate a single average treatment effect:

$$Y_{it} = \alpha_i + \lambda_t + \beta D_{it} + \varepsilon_{it}$$

- $\alpha_i$: unit fixed effects (absorb time-invariant differences)
- $\lambda_t$: time fixed effects (absorb common shocks)
- $D_{it}$: treatment indicator (1 if unit $i$ is treated at time $t$)
- $\beta$: average treatment effect on the treated (ATT)
panel_task <- PanelEventStudyTask$new(
panel_data,
unit_id = "unit_id",
time_id = "time_id",
treatment = "treated",
outcome = "outcome"
)
result <- estimate_panel_event_study(panel_task, method = "static_twfe")
result$results$coefficients

| Pros | Cons |
|---|---|
| Single interpretable ATT | Biased with staggered treatment |
| Familiar regression framework | Misses treatment dynamics |
| Simple to estimate | Assumes homogeneous effects |
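Static TWFE is just a linear regression with unit and time dummies, so the specification can be sketched in a few lines of base R. The simulated data and column names below are illustrative, not the package's internal implementation:

```r
# Minimal base-R sketch of static TWFE on simulated panel data
set.seed(42)
df <- expand.grid(unit_id = 1:20, time_id = 1:10)
# Units 1-10 treated from period 6 onward; true ATT = 2
df$treated <- as.integer(df$unit_id <= 10 & df$time_id >= 6)
df$outcome <- 0.5 * df$unit_id + 0.2 * df$time_id +
  2 * df$treated + rnorm(nrow(df))

# Unit and time fixed effects enter as factor dummies; the coefficient
# on `treated` is the ATT estimate
fit <- lm(outcome ~ treated + factor(unit_id) + factor(time_id), data = df)
unname(coef(fit)["treated"])  # close to the true effect of 2
```

With all units treated at the same date this recovers the ATT; the bias discussed below only arises once treatment timing is staggered.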
Dynamic TWFE
The dynamic specification replaces the single treatment indicator with event-time dummies, allowing you to trace the treatment effect over time and test parallel trends:

$$Y_{it} = \alpha_i + \lambda_t + \sum_{k \neq -1} \beta_k \, \mathbf{1}[t - E_i = k] + \varepsilon_{it}$$

- $k = t - E_i$, where $E_i$ is the treatment date for unit $i$
- Period $k = -1$ is omitted as the reference period
- Pre-treatment coefficients ($\beta_k$ for $k < 0$) test parallel trends
- Post-treatment coefficients ($\beta_k$ for $k \geq 0$) trace the dynamic effect
result <- estimate_panel_event_study(
panel_task,
method = "dynamic_twfe",
leads = 5,
lags = 5,
base_period = -1
)
plot_panel_event_study(result)

Pre-treatment coefficients near zero confirm parallel trends. The jump at period 0 shows the treatment effect.
| Pros | Cons |
|---|---|
| Tests parallel trends visually | Biased with staggered treatment timing |
| Shows treatment dynamics | "Forbidden comparisons" between cohorts |
| Identifies temporary vs. permanent effects | Requires sufficient pre-treatment periods |
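The event-time dummies can also be built by hand in base R. A sketch with simulated single-cohort data (the `treatment_time` column follows the data-format section; never-treated controls are parked in the omitted $k = -1$ bin, so they identify only the fixed effects):

```r
set.seed(1)
df <- expand.grid(unit_id = 1:30, time_id = 1:10)
df$treatment_time <- ifelse(df$unit_id <= 15, 6, NA)  # units 1-15 treated at period 6
df$outcome <- 0.3 * df$time_id +
  ifelse(!is.na(df$treatment_time) & df$time_id >= df$treatment_time, 1.5, 0) +
  rnorm(nrow(df))

# Event time k = t - E_i; never-treated controls (NA) go to the
# reference bin k = -1, which is the omitted category
df$k <- df$time_id - df$treatment_time
df$k[is.na(df$k)] <- -1
df$k <- relevel(factor(df$k), ref = "-1")

fit <- lm(outcome ~ k + factor(unit_id) + factor(time_id), data = df)
coef(fit)[grep("^k", names(coef(fit)))]  # pre-period coefs near 0, post near 1.5
```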
Sun & Abraham (2021)
When treatment is staggered (units treated at different times), standard TWFE produces biased estimates because it makes "forbidden comparisons" — using already-treated units as controls for newly-treated units. As shown by Goodman-Bacon (2021), the TWFE estimator is a weighted average of all possible 2x2 DiD comparisons, and some weights can be negative, leading to sign reversal of the estimated treatment effect. The Sun & Abraham interaction-weighted estimator resolves this by estimating cohort-specific effects and aggregating properly.
staggered_task <- PanelEventStudyTask$new(staggered_data)
result <- estimate_panel_event_study(
staggered_task,
method = "sun_abraham",
leads = 5,
lags = 5
)
plot_panel_event_study(result)
result$results$coefficients

| Pros | Cons |
|---|---|
| Unbiased under staggered treatment | Requires a never-treated or last-treated group |
| Handles heterogeneous effects across cohorts | Less efficient than TWFE when effects are homogeneous |
| Interpretable cohort-specific estimates | Requires more data than static TWFE |
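For comparison, the `fixest` package ships the same interaction-weighted estimator via its `sunab()` helper. A sketch under the column names from the data-format section (note the assumption that never-treated units carry a cohort value outside the sample window rather than NA, which is worth checking against the `fixest` documentation):

```r
library(fixest)

# Code never-treated controls with a cohort far outside the panel
d <- panel_data
d$treatment_time[is.na(d$treatment_time)] <- 10000

fit <- feols(outcome ~ sunab(treatment_time, time_id) | unit_id + time_id,
             data = d, cluster = ~unit_id)
summary(fit)  # interaction-weighted event-time coefficients
iplot(fit)    # event-study plot
```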
Modern DiD Estimators
Callaway & Sant'Anna (2021)
Published in the Journal of Econometrics in 2021 and cited over 5,000 times, this estimator computes group-time average treatment effects $ATT(g, t)$ — the treatment effect for cohort $g$ (units first treated at period $g$) at time $t$. These can be flexibly aggregated into event-time, group, or calendar-time summaries.
The key insight: by estimating effects separately for each cohort, the estimator avoids the "forbidden comparisons" that bias TWFE. It combines propensity score and outcome regression models for doubly robust estimation.
result <- estimate_panel_event_study(
staggered_task,
method = "callaway_santanna",
leads = 5,
lags = 5
)
plot_panel_event_study(result)

| Pros | Cons |
|---|---|
| Doubly robust (propensity score + outcome regression) | Requires did package |
| Flexible aggregation of group-time ATTs | More complex output to interpret |
| Built-in bootstrap inference | Requires sufficient group sizes |
| Handles covariates naturally | Slower than Sun & Abraham for large panels |
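The estimator's reference implementation in the `did` package can also be called directly. A sketch using the column names from the data-format section (the `did` convention codes never-treated units with `gname = 0`, and `idname` must be numeric):

```r
library(did)

d <- panel_data
d$treatment_time[is.na(d$treatment_time)] <- 0  # 0 marks never-treated in `did`

# Group-time ATTs, then aggregation to event time
attgt <- att_gt(yname = "outcome", tname = "time_id", idname = "unit_id",
                gname = "treatment_time", data = d)
dyn <- aggte(attgt, type = "dynamic")
summary(dyn)
ggdid(dyn)  # event-study plot
```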
de Chaisemartin & D'Haultfoeuille (2020)
Focuses on switchers — units that change treatment status. Estimates the average effect on units that switch into treatment, using units whose treatment status does not change as controls. Particularly useful when treatment can turn on and off.
result <- estimate_panel_event_study(
staggered_task,
method = "dechaisemartin_dhaultfoeuille",
leads = 5,
lags = 5
)
result$results$coefficients

| Pros | Cons |
|---|---|
| Handles treatment reversals (on/off) | Requires DIDmultiplegt package |
| Intuitive interpretation via switchers | Computationally intensive for large panels |
| Robust to heterogeneous effects | Requires careful specification of placebos |
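A direct call to the `DIDmultiplegt` package looks roughly like the sketch below. The argument names follow the package's classic interface as I recall it; newer releases have moved to a companion `DIDmultiplegtDYN` package, so verify against the version you have installed:

```r
library(DIDmultiplegt)

# Y/G/T/D name the outcome, group (unit), time, and treatment columns
res <- did_multiplegt(df = panel_data, Y = "outcome", G = "unit_id",
                      T = "time_id", D = "treated")
res  # switchers ATT and inference
```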
Borusyak, Jaravel & Spiess (2024)
An imputation estimator that first estimates unit and time fixed effects using only untreated observations, then imputes counterfactual outcomes for treated units. The treatment effect is the difference between observed and imputed outcomes.
result <- estimate_panel_event_study(
staggered_task,
method = "borusyak_jaravel_spiess",
leads = 5,
lags = 5
)
plot_panel_event_study(result)

| Pros | Cons |
|---|---|
| Efficient — uses all untreated observations | Requires didimputation package |
| Clean event-time coefficient estimates | Assumes treatment effect homogeneity across cohorts for efficiency |
| Simple imputation logic | Requires sufficient untreated observations |
| Fast even for large panels | Cannot handle treatment reversals |
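A direct call to the `didimputation` package is sketched below, with column names from the data-format section. I assume never-treated units are flagged with a zero `gname`, as in the `did` package; check the `didimputation` docs for the coding it expects:

```r
library(didimputation)

d <- panel_data
d$treatment_time[is.na(d$treatment_time)] <- 0  # assumed never-treated coding

# Fixed effects estimated on untreated cells, counterfactuals imputed,
# then effects reported by event-time horizon
res <- did_imputation(data = d, yname = "outcome", gname = "treatment_time",
                      tname = "time_id", idname = "unit_id",
                      horizon = TRUE)
res
```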
Which Estimator Should I Use?
| Question | Answer | Estimator |
|---|---|---|
| All units treated at the same time? | Yes, no event-time dynamics needed | Static TWFE |
| All units treated at the same time? | Yes, need event-time dynamics | Dynamic TWFE |
| Staggered treatment, treatment reversals? | Yes | de Chaisemartin & D'Haultfoeuille |
| Staggered treatment, simplicity priority? | Yes, no extra deps | Sun & Abraham |
| Staggered treatment, flexible aggregation? | Yes | Callaway & Sant'Anna |
| Staggered treatment, efficiency priority? | Yes | Borusyak, Jaravel & Spiess |
- **ATT (Average Treatment Effect on the Treated)**: The mean causal effect of the intervention on units that actually received treatment; the primary estimand in panel event studies.
- **Staggered Treatment**: A design where different units receive treatment at different times, requiring heterogeneity-robust estimators to avoid bias from "forbidden comparisons" in standard TWFE.
- **Parallel Trends**: The identifying assumption that treated and control units would have followed the same outcome trajectory in the absence of treatment, testable via pre-treatment coefficient plots.
How Do Cluster-Robust Standard Errors Work?
All estimators compute cluster-robust standard errors via sandwich::vcovCL(). By default, clustering is at the unit level:
# Default: clustered at unit level
result <- estimate_panel_event_study(
panel_task, method = "sun_abraham", leads = 5, lags = 5
)
# Custom clustering variable
result <- estimate_panel_event_study(
panel_task, method = "sun_abraham",
leads = 5, lags = 5, cluster = "state_id"
)

Cluster at the treatment level
If treatment varies at the state level but your units are firms within states, cluster at the state level to account for within-state correlation.
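Under the hood this amounts to a sandwich-estimator call, which can be sketched in base R with the `sandwich` and `lmtest` packages (the `state_id` column is hypothetical, as in the snippet above):

```r
library(sandwich)
library(lmtest)

fit <- lm(outcome ~ treated + factor(unit_id) + factor(time_id),
          data = panel_data)

# Unit-level clustering (the default described above)
coeftest(fit, vcov = vcovCL(fit, cluster = ~unit_id))

# Clustering at the level where treatment varies
coeftest(fit, vcov = vcovCL(fit, cluster = ~state_id))
```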
What Diagnostic Checks Should I Run?
- Parallel trends: Pre-treatment coefficients should be jointly zero — flat and insignificant before treatment
- Visual inspection: Plot dynamic coefficients with confidence intervals; look for pre-trends
- Placebo tests: Shift treatment timing earlier to check for spurious effects
- Balance checks: Verify treated and control groups have similar pre-treatment characteristics
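The placebo test can be scripted directly against the task API from the sections above. A sketch that shifts every treatment date back three periods (the shift size and column names are illustrative):

```r
# Shift treatment 3 periods earlier; a significant "effect" at the fake
# date signals differential pre-trends rather than a causal effect
placebo_data <- panel_data
placebo_data$treatment_time <- placebo_data$treatment_time - 3
placebo_data$treated <- as.integer(
  !is.na(placebo_data$treatment_time) &
    placebo_data$time_id >= placebo_data$treatment_time
)

placebo_task <- PanelEventStudyTask$new(
  placebo_data,
  unit_id = "unit_id", time_id = "time_id",
  treatment = "treated", outcome = "outcome"
)
placebo_result <- estimate_panel_event_study(placebo_task, method = "sun_abraham")
```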
Literature
- Bertrand, M., Duflo, E. & Mullainathan, S. (2004). How much should we trust differences-in-differences estimates? The Quarterly Journal of Economics, 119(1), 249–275.
- Sun, L. & Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2), 175–199.
- Callaway, B. & Sant'Anna, P.H.C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200–230.
- de Chaisemartin, C. & D'Haultfoeuille, X. (2020). Two-way fixed effects estimators with heterogeneous treatment effects. American Economic Review, 110(9), 2964–2996.
- Borusyak, K., Jaravel, X. & Spiess, J. (2024). Revisiting event-study designs: Robust and efficient estimation. Review of Economic Studies, 91(6), 3253–3285.
- Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics, 225(2), 254–277.
- Miller, D.L. (2023). An Introductory Guide to Event Study Models. Journal of Economic Perspectives, 37(2), 203–230.
Implement this with the R package
Access advanced features and full customization through the EventStudy R package.
What Should I Read Next?
- Expected Return Models — cross-sectional event study models
- Synthetic Control — single-unit causal inference
- Diagnostics & Export — validate and export results
- Test Statistics — significance testing