Choosing the Right Test Statistic
The choice of test statistic determines whether your event study produces valid inference. Different tests make different assumptions about the distribution of abnormal returns, the behavior of variance around the event, and the independence of events across firms. Using the wrong test can lead to over-rejection (finding effects that are not there) or under-rejection (missing real effects). This page provides a systematic decision framework for selecting the right test.
Part of the Methodology Guide
This page is part of the Event Study Methodology Guide. For formulas and implementation details of individual tests, see AR & CAR Tests and AAR & CAAR Tests.
Decision Framework
Selecting a test statistic involves answering four questions about your study design. Each question narrows the set of appropriate tests.
| Question | If Yes | If No |
|---|---|---|
| 1. Are you studying a single event for a single firm? | Use AR/CAR tests (single-event) | Use AAR/CAAR tests (multi-event) |
| 2. Can you assume normally distributed returns? | Parametric tests are valid | Use non-parametric tests (Sign, Rank) |
| 3. Does the event change return variance? | Use variance-robust tests (BMP, Kolari-Pynnonen) | Standard tests (Patell Z, Cross-Sectional t) are adequate |
| 4. Are event dates clustered across firms? | Use clustering-robust tests (Kolari-Pynnonen, Calendar-Time) | Cross-sectional independence assumption holds |
Parametric vs. Non-Parametric Tests
The first major distinction is between parametric and non-parametric tests. Each class has strengths and weaknesses.
Parametric Tests
Parametric tests assume that abnormal returns follow a known distribution (typically normal). They use the estimated variance of abnormal returns to construct test statistics. When the distributional assumptions hold, parametric tests are more powerful than non-parametric alternatives.
| Test | Key Assumption | Robust To | Not Robust To |
|---|---|---|---|
| Cross-Sectional t-test | AR ~ Normal, equal variance | Heterogeneous event effects | Event-induced variance; non-normality |
| Patell Z | Standardized AR ~ Normal | Heterogeneous firm-specific variance | Event-induced variance; clustering |
| BMP test | Standardized AR ~ Normal | Event-induced variance changes | Clustering; severe non-normality |
| Kolari-Pynnonen | Standardized AR ~ Normal | Event-induced variance + clustering | Severe non-normality |
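To make the simplest entry in the table concrete, the Cross-Sectional t-test can be computed directly from the vector of event-window CARs. This is a minimal base-R sketch on simulated data; the `car` vector stands in for your firms' cumulative abnormal returns:

```r
# Cross-sectional t-test: t = mean(CAR) / (sd(CAR) / sqrt(N))
set.seed(1)
car <- rnorm(50, mean = 0.02, sd = 0.05)  # simulated CARs for 50 firms

n      <- length(car)
t_stat <- mean(car) / (sd(car) / sqrt(n))
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)

# Identical to base R's one-sample t-test against mu = 0:
all.equal(t_stat, unname(t.test(car)$statistic))  # TRUE
```

Because the test uses the cross-sectional dispersion of CARs, any variance increase shared across firms is partially reflected in `sd(car)`, which is why the table lists it as only partially robust.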
Non-Parametric Tests
Non-parametric tests make minimal distributional assumptions. They are particularly useful when returns are skewed, heavy-tailed, or contain outliers — conditions that are common in practice, especially for small-cap stocks and emerging markets.
| Test | How It Works | Strengths | Weaknesses |
|---|---|---|---|
| Sign test | Tests whether the proportion of positive ARs exceeds 50% | Robust to outliers and non-normality | Low power; ignores AR magnitude |
| Generalized Sign test | Adjusts the expected proportion using estimation window data | Accounts for asymmetric return distributions | Still ignores magnitude; requires estimation window data |
| Rank test | Ranks ARs against estimation window residuals | Robust to non-normality; uses magnitude information | Assumes symmetric distribution under H0 |
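The mechanics of the Generalized Sign test can be sketched in a few lines of base R. Everything here is simulated for illustration; `p_hat` is the expected proportion of positive abnormal returns estimated from the estimation window, following the construction usually attributed to Cowan (1992):

```r
# Generalized Sign test: compare the number of firms with positive
# event-window CARs against the proportion expected from the estimation
# window (rather than against 0.5, as in the plain sign test).
set.seed(42)
n_firms <- 40
est_ar  <- matrix(rnorm(n_firms * 120, mean = -0.0005, sd = 0.02),
                  nrow = n_firms)                    # estimation-window ARs
event_car <- rnorm(n_firms, mean = 0.01, sd = 0.03)  # event-window CARs

p_hat <- mean(rowMeans(est_ar > 0))  # expected proportion of positive ARs
w     <- sum(event_car > 0)          # firms with a positive event CAR
z     <- (w - n_firms * p_hat) / sqrt(n_firms * p_hat * (1 - p_hat))
p_val <- 2 * pnorm(-abs(z))
```

Note that the statistic depends only on signs, never on magnitudes, which is exactly the low-power trade-off listed in the table.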
Use both
Best practice is to report both parametric and non-parametric test results. If they agree, your conclusions are robust. If they disagree, investigate why — it often reveals data quality issues or distributional violations.
Handling Event-Induced Variance
Many events cause return variance to increase on the event date. For example, earnings announcements typically double or triple daily return variance. M&A announcements can increase variance even more. This phenomenon is called event-induced variance.
Standard tests like the Patell Z assume that the variance of abnormal returns in the event window equals the estimation-window variance. When event-induced variance is present, this assumption is violated, and the Patell Z test over-rejects the null hypothesis — it reports significant effects even when there are none.
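The over-rejection is easy to reproduce by simulation. In this base-R sketch (toy numbers, not output from the package), the true event-day variance is three times the estimation-window variance, yet the Patell-style statistic standardizes by the estimation-window standard deviation only:

```r
# Simulated over-rejection of a Patell-style z-test under event-induced variance
set.seed(123)
n_firms <- 30
n_sims  <- 2000
reject <- replicate(n_sims, {
  sd_est <- 0.02                                      # estimation-window sd per firm
  ar     <- rnorm(n_firms, 0, sd = sqrt(3) * sd_est)  # event-day variance is 3x
  z      <- sum(ar / sd_est) / sqrt(n_firms)          # standardize by estimation sd only
  abs(z) > qnorm(0.975)                               # nominal 5% two-sided test
})
mean(reject)  # rejection rate well above the nominal 0.05
```

Even though there is no abnormal return on average, the test rejects far more often than 5% of the time, purely because the denominator understates the event-day variance.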
| Test | Handles Event-Induced Variance? | How |
|---|---|---|
| Cross-Sectional t-test | Partially | Uses cross-sectional variance, which captures some variance increase |
| Patell Z | No | Uses estimation-window variance only |
| BMP test | Yes | Standardizes by estimation-window variance, then uses cross-sectional variance of standardized ARs |
| Kolari-Pynnonen | Yes | Extends BMP with clustering adjustment |
| Sign test | Yes | Does not use variance estimates |
| Rank test | Partially | Rank transformation reduces the effect of variance changes |
The BMP test (Boehmer, Musumeci, and Poulsen, 1991) is the most widely recommended solution. It first standardizes each firm's abnormal return by its estimation-window standard deviation, then computes the cross-sectional standard deviation of these standardized abnormal returns. This two-step approach captures event-induced variance in the cross-sectional step.
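The two-step computation just described can be sketched in base R on simulated inputs; here `sd_est` plays the role of each firm's estimation-window residual standard deviation:

```r
# BMP (standardized cross-sectional) test, event-day version
set.seed(7)
n_firms <- 50
sd_est  <- runif(n_firms, 0.01, 0.04)         # estimation-window sd per firm
ar      <- rnorm(n_firms, 0.005, 2 * sd_est)  # event-day ARs with inflated variance

sar   <- ar / sd_est                            # step 1: standardize each AR
z_bmp <- mean(sar) / (sd(sar) / sqrt(n_firms))  # step 2: cross-sectional t on SARs
p_val <- 2 * pnorm(-abs(z_bmp))
```

Because `sd(sar)` is computed from the event day itself, any common variance inflation shows up in the denominator, which is what restores correct test size.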
```r
# Configure with BMP test
ps <- ParameterSet$new(
  multi_event_statistics = MultiEventStatisticsSet$new(
    tests = list(
      BMPTest$new(),             # Robust to event-induced variance
      CSectTTest$new(),          # Comparison (not robust)
      PatellZTest$new(),         # Comparison (not robust)
      GeneralizedSignTest$new()  # Non-parametric robustness check
    )
  )
)
task <- run_event_study(task, ps)
```
Handling Cross-Correlation (Event Clustering)
When multiple firms experience the event on the same date (or within overlapping event windows), their abnormal returns are cross-sectionally correlated. This violates the independence assumption of most tests and causes over-rejection.
Common scenarios with clustered events include:
- Regulatory changes: A new law affects all firms in an industry on the same date.
- Macroeconomic announcements: Interest rate decisions, GDP releases, or employment reports affect all stocks simultaneously.
- Industry-wide events: A product recall or safety incident that affects all competitors.
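The size of the problem is easy to quantify: with N firms and an average pairwise correlation of abnormal returns ρ̄, the variance of the average abnormal return is (σ²/N)(1 + (N − 1)ρ̄) rather than σ²/N. A small base-R illustration with toy numbers:

```r
# Variance inflation of the mean AR under cross-sectional correlation
n_firms <- 50
rho     <- 0.05    # modest average pairwise correlation of ARs
sigma2  <- 0.02^2  # per-firm AR variance

var_indep <- sigma2 / n_firms                            # independence assumed
var_true  <- (sigma2 / n_firms) * (1 + (n_firms - 1) * rho)  # with correlation
var_true / var_indep  # inflation factor: 1 + 49 * 0.05 = 3.45
```

Even a seemingly small ρ̄ of 0.05 inflates the true variance by a factor of 3.45 here, so a test that assumes independence overstates its t-statistics by a factor of about sqrt(3.45) ≈ 1.86.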
| Approach | Test / Method | When to Use |
|---|---|---|
| Clustering-adjusted test | Kolari-Pynnonen | Moderate clustering; want to keep standard event study framework |
| Calendar-Time Portfolio | CalendarTimePortfolioTest | Severe clustering; events concentrated on few dates |
| Crude Dependence Adjustment | Manual adjustment to Patell Z | Quick adjustment; moderate clustering |
| Portfolio approach | Aggregate firms into one portfolio per event date | All firms share the same event date |
```r
# Kolari-Pynnonen: robust to both variance change and clustering
ps <- ParameterSet$new(
  multi_event_statistics = MultiEventStatisticsSet$new(
    tests = list(
      KolariPynnonenTest$new(),         # Robust to variance + clustering
      CalendarTimePortfolioTest$new(),  # Portfolio approach
      BMPTest$new(),                    # Robust to variance only (comparison)
      RankTest$new()                    # Non-parametric check
    )
  )
)
task <- run_event_study(task, ps)
```
Do not ignore clustering
Ignoring cross-correlation when events are clustered leads to severely inflated test statistics. Kolari and Pynnonen (2010) show that the actual rejection rate of the Patell Z test can exceed 70% at a nominal 5% significance level when events are clustered. Always check whether your event dates overlap.
Recommended Tests by Scenario
The table below provides concrete recommendations for common event study scenarios. Each recommendation includes a primary test and a robustness check.
| Scenario | Primary Test | Robustness Check | Rationale |
|---|---|---|---|
| M&A announcements (target firms) | BMP test | Rank test | Large abnormal returns cause variance increase; target CARs are well-behaved |
| Earnings announcements | BMP test | Generalized Sign test | Strong event-induced variance; large samples available |
| Regulatory changes (industry-wide) | Kolari-Pynnonen | Calendar-Time Portfolio | Same event date for all firms; cross-correlation is severe |
| ESG events (heterogeneous dates) | BMP test | Sign test | Dates vary across firms; event-induced variance moderate |
| Small sample (N < 20) | Cross-Sectional t-test | Sign test | BMP and Patell Z require larger samples for asymptotic properties |
| Non-normal returns (small caps) | Rank test | Generalized Sign test | Parametric tests unreliable with heavy tails and skewness |
| Long-run studies (BHAR) | Skewness-adjusted t-test | Bootstrap | BHAR returns are severely right-skewed; standard t-tests invalid |
| Single-firm case study | AR t-test + CAR t-test | Permutation test | No cross-sectional aggregation; use exact tests if possible |
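For the long-run (BHAR) row, the skewness-adjusted t-statistic can be sketched in base R. This follows the Johnson-type adjustment popularized by Lyon, Barber, and Tsai (1999); the data below are simulated right-skewed returns, not real BHARs:

```r
# Skewness-adjusted t-statistic for right-skewed long-run returns
set.seed(99)
bhar <- rlnorm(200, meanlog = -0.05, sdlog = 0.4) - 1  # simulated skewed BHARs
n    <- length(bhar)

s     <- mean(bhar) / sd(bhar)                          # standardized mean
gamma <- sum((bhar - mean(bhar))^3) / (n * sd(bhar)^3)  # sample skewness
t_sa  <- sqrt(n) * (s + gamma * s^2 / 3 + gamma / (6 * n))
```

The two correction terms push the statistic back toward correct size when the return distribution has a long right tail; with symmetric returns (`gamma` near 0) the statistic collapses to the ordinary t.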
Decision Tree
Follow this decision tree to select your test statistics. Start at the top and follow the path that matches your study.
| Step | Condition | Action |
|---|---|---|
| 1 | Single firm? | If yes: use AR t-test + CAR t-test, add a Permutation test for robustness, and stop. If no: go to Step 2. |
| 2 | Multiple firms. Are event dates clustered? | If yes: go to Step 2a. If no: go to Step 2b. |
| 2a | Yes, clustered dates. | Use Kolari-Pynnonen + Calendar-Time Portfolio + Rank test. Stop. |
| 2b | No, non-clustered. | Go to Step 3. |
| 3 | Does the event likely change variance? | If yes: go to Step 3a. If no: go to Step 3b. |
| 3a | Yes, event-induced variance. | Use BMP test + Generalized Sign test. Stop. |
| 3b | No, stable variance. | Go to Step 4. |
| 4 | Returns approximately normal? | If yes: go to Step 4a. If no: go to Step 4b. |
| 4a | Yes, normal returns. | Use Cross-Sectional t-test + Patell Z + Sign test. Stop. |
| 4b | No, non-normal returns. | Use Rank test + Generalized Sign test. Stop. |
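The decision tree above can also be encoded as a small helper function. This is a hypothetical convenience wrapper, not part of the EventStudy package; it simply maps the four answers to the recommended test names:

```r
# Hypothetical helper encoding the decision tree above
recommend_tests <- function(single_firm = FALSE, clustered = FALSE,
                            variance_change = FALSE, normal_returns = TRUE) {
  if (single_firm)     return(c("AR t-test", "CAR t-test", "Permutation test"))
  if (clustered)       return(c("Kolari-Pynnonen", "Calendar-Time Portfolio", "Rank test"))
  if (variance_change) return(c("BMP", "Generalized Sign"))
  if (normal_returns)  return(c("Cross-Sectional t", "Patell Z", "Sign test"))
  c("Rank test", "Generalized Sign test")
}

recommend_tests(clustered = TRUE)
# -> c("Kolari-Pynnonen", "Calendar-Time Portfolio", "Rank test")
```

The branch order matters: clustering dominates the variance question because cross-correlation invalidates even variance-robust tests such as BMP.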
How Do I Configure Multiple Tests in R?
The EventStudy package allows you to run multiple tests simultaneously. This makes it easy to compare results and assess robustness.
```r
# Recommended setup: parametric + non-parametric + variance-robust
ps <- ParameterSet$new(
  # Single-event tests
  single_event_statistics = SingleEventStatisticsSet$new(
    tests = list(
      ARTTest$new(),
      CARTTest$new()
    )
  ),
  # Multi-event tests
  multi_event_statistics = MultiEventStatisticsSet$new(
    tests = list(
      # Parametric
      CSectTTest$new(),          # Baseline
      PatellZTest$new(),         # Standardized
      BMPTest$new(),             # Variance-robust
      KolariPynnonenTest$new(),  # Variance + clustering robust
      # Non-parametric
      SignTest$new(),
      GeneralizedSignTest$new(),
      RankTest$new()
    )
  )
)
task <- run_event_study(task, ps)

# View all test results
task$get_aar()
```

```r
# One-sided tests (e.g., testing for positive abnormal returns only)
ps <- ParameterSet$new(
  multi_event_statistics = MultiEventStatisticsSet$new(
    tests = list(
      CSectTTest$new(confidence_type = "one-sided"),
      BMPTest$new(confidence_type = "one-sided")
    )
  )
)
```
Common Mistakes
- Using only the Patell Z test: The Patell Z is popular because it is well-known, but it is not robust to event-induced variance. Always include the BMP test.
- Ignoring clustering: If your events share event dates (even partially overlapping windows), standard tests will produce spuriously significant results.
- Relying on a single test: No single test is best in all scenarios. Report at least two tests (one parametric, one non-parametric) to demonstrate robustness.
- Using parametric tests with small samples: With fewer than 20 events, asymptotic properties of Patell Z and BMP may not hold. Prefer the Cross-Sectional t-test and non-parametric tests.
- Not checking normality: Run a Shapiro-Wilk test on the estimation-window residuals or inspect a QQ-plot before relying on parametric tests.
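The normality check from the last point takes two lines of base R. Here the residuals are simulated as heavy-tailed for illustration; replace `resid` with your own estimation-window residuals:

```r
# Check normality of estimation-window residuals before trusting parametric tests
set.seed(11)
resid <- rt(120, df = 3) * 0.02  # simulated heavy-tailed residuals

shapiro.test(resid)              # small p-value: reject normality
qqnorm(resid); qqline(resid)     # visual check: heavy tails bend away from the line
```

If the Shapiro-Wilk test rejects or the QQ-plot shows pronounced tails, lean on the Rank and Generalized Sign tests rather than Patell Z or BMP.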
Literature
- Boehmer, E., Musumeci, J. & Poulsen, A.B. (1991). Event-study methodology under conditions of event-induced variance. Journal of Financial Economics, 30(2), 253-272.
- Kolari, J.W. & Pynnonen, S. (2010). Event study testing with cross-sectional correlation of abnormal returns. Review of Financial Studies, 23(11), 3996-4025.
- Corrado, C.J. (1989). A nonparametric test for abnormal security-price performance in event studies. Journal of Financial Economics, 23(2), 385-395.
- Patell, J.M. (1976). Corporate forecasts of earnings per share and stock price behavior. Journal of Accounting Research, 14(2), 246-276.
Implement this with the R package
Access advanced features and full customization through the EventStudy R package.
What Should I Do Next?
- AR & CAR Test Statistics — formulas and code for single-event tests
- AAR & CAAR Test Statistics — formulas and code for multi-event tests
- Variance-Based Tests — deep dive into event-induced variance handling
- Inference & Robustness — wild bootstrap and multiple testing corrections