AAR & CAAR Test Statistics
AAR and CAAR test statistics determine whether the average abnormal return across a sample of firms is significantly different from zero. Eight tests are available, from the cross-sectional t-test to the Kolari-Pynnonen test that handles event-induced variance and cross-sectional correlation.
Part of the Event Study Methodology Guide.
Multi-event test statistics assess whether the average abnormal return across a sample of firms is significantly different from zero. These are the tests reported in most published event studies.
How Do the Multi-Event Tests Compare?
AAR and CAAR test statistics are the backbone of multi-event inference in financial event studies. They aggregate abnormal returns across a sample of firms to determine whether an event type produces a statistically significant market reaction on average. First formalized by Patell (1976) and extended by Boehmer, Musumeci, and Poulsen (1991), these tests range from simple cross-sectional t-tests to advanced procedures that correct for event-induced variance and cross-sectional correlation.
| Test | R Class | Type | Assumes Normality | Handles Event-Induced Variance | Handles Clustering |
|---|---|---|---|---|---|
| Cross-Sectional t | CSectTTest | Parametric | Yes | No | No |
| Patell Z | PatellZTest | Parametric | Yes | No | No |
| BMP | BMPTest | Parametric | Yes | Yes | No |
| Sign test | SignTest | Non-parametric | No | No | No |
| Generalized Sign | GeneralizedSignTest | Non-parametric | No | No | No |
| Rank test | RankTest | Non-parametric | No | Yes | No |
| Kolari-Pynnonen | KolariPynnonenTest | Parametric | Yes | Yes | Yes |
| Calendar-Time Portfolio | CalendarTimePortfolioTest | Parametric | Yes | No | Yes |
How Do I Run All Tests in R?
```r
# Run with all 8 multi-event tests
ps <- ParameterSet$new(
  multi_event_statistics = MultiEventStatisticsSet$new(
    tests = list(
      CSectTTest$new(),
      PatellZTest$new(),
      BMPTest$new(),
      KolariPynnonenTest$new(),
      SignTest$new(),
      GeneralizedSignTest$new(),
      RankTest$new(),
      CalendarTimePortfolioTest$new()
    )
  )
)
task <- EventStudyTask$new(firm_data, index_data, request)
task <- run_event_study(task, ps)
```
Parametric Tests
Cross-Sectional t-test (CSect T)
The simplest multi-event test, used in over 60% of published event studies. Computes the AAR as the mean of abnormal returns across firms, then tests whether it differs from zero using the cross-sectional standard deviation.
AAR:

$$\overline{AR}_t = \frac{1}{N}\sum_{i=1}^{N} AR_{i,t}, \qquad t_{AAR} = \sqrt{N}\,\frac{\overline{AR}_t}{s_{AR_t}}$$

where $s_{AR_t}$ is the cross-sectional standard deviation of the $AR_{i,t}$ across the $N$ firms.

CAAR:

$$\overline{CAR} = \frac{1}{N}\sum_{i=1}^{N} CAR_i, \qquad t_{CAAR} = \sqrt{N}\,\frac{\overline{CAR}}{s_{CAR}}$$

where $s_{CAR}$ is the cross-sectional standard deviation of the $CAR_i$.
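The arithmetic is simple enough to sketch in a few lines. The following Python snippet (illustrative only, with hypothetical toy returns; not the EventStudy package implementation) computes the AAR and the cross-sectional t-statistic at a single event time:

```python
import math
import statistics

def csect_t(ars):
    """Cross-sectional t-test at one event time.

    ars: list of abnormal returns, one per firm (toy data here).
    Returns (AAR, t-statistic); under H0, t is approximately
    Student-t distributed with N - 1 degrees of freedom.
    """
    n = len(ars)
    aar = sum(ars) / n              # average abnormal return
    s = statistics.stdev(ars)       # cross-sectional std dev (n - 1 divisor)
    t = math.sqrt(n) * aar / s
    return aar, t

# Hypothetical event-day abnormal returns for 5 firms
aar, t = csect_t([0.02, 0.015, -0.005, 0.03, 0.01])
```

With these toy numbers the AAR is 1.4% and the t-statistic is roughly 2.4.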
| Pros | Cons |
|---|---|
| Simple, intuitive | Assumes equal variance across firms |
| No model estimates needed for SE | Does not account for event-induced variance |
| Robust to different return models | Sensitive to outliers |
Patell Z Test
Introduced by Patell in 1976, this test standardizes each firm's abnormal return by its own estimation-window standard deviation before aggregating. This gives less weight to volatile stocks and has been cited in over 2,500 studies according to Google Scholar.
Standardized abnormal return:

$$SAR_{i,t} = \frac{AR_{i,t}}{s_{AR_i}}$$

Test statistic:

$$z_{Patell} = \frac{\sum_{i=1}^{N} SAR_{i,t}}{\sqrt{\sum_{i=1}^{N} \frac{M_i - 2}{M_i - 4}}}$$

where $M_i$ is the estimation window length for firm $i$ and $s_{AR_i}$ is firm $i$'s estimation-window residual standard deviation.
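A minimal Python sketch of this aggregation (toy inputs, not the package implementation; the $(M_i-2)/(M_i-4)$ term is the null variance of a standardized residual from Patell's derivation):

```python
import math

def patell_z(ars, est_sds, est_lens):
    """Patell (1976) z-statistic.

    ars: event-day abnormal return per firm (toy data)
    est_sds: estimation-window residual std dev per firm
    est_lens: estimation-window length M_i per firm
    """
    sars = [ar / s for ar, s in zip(ars, est_sds)]
    # Var(SAR_i) = (M_i - 2) / (M_i - 4) under the null
    var_sum = sum((m - 2) / (m - 4) for m in est_lens)
    return sum(sars) / math.sqrt(var_sum)

# Three hypothetical firms, each with a 250-day estimation window
z = patell_z([0.02, -0.01, 0.03], [0.01, 0.02, 0.015], [250, 250, 250])
```

Note how the second firm's AR gets half the weight of an equally sized AR at a less volatile firm; here z comes out at roughly 2.0.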
| Pros | Cons |
|---|---|
| Accounts for different firm volatilities | Assumes no event-induced variance change |
| More powerful than CSect T for heterogeneous samples | Requires estimation-window variance estimates |
| Well-established in the literature | Over-rejects when variance increases at event |
Patell Z vs. CSect T
If firms in your sample have very different volatilities (e.g., mixing large-cap and small-cap), Patell Z is more appropriate because it standardizes by each firm's own variance.
BMP Test (Boehmer, Musumeci & Poulsen)
Addresses a key weakness of the Patell Z test: events often change the variance of returns (event-induced variance). The BMP test cross-sectionally standardizes the Patell-standardized AR, making it robust to variance changes at the event date.
$$z_{BMP} = \sqrt{N}\,\frac{\overline{SAR}_t}{s_{SAR_t}}$$

where $\overline{SAR}_t = \frac{1}{N}\sum_{i=1}^{N} SAR_{i,t}$ and $s_{SAR_t}$ is its cross-sectional standard deviation.
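The key step, sketched below in Python with hypothetical SARs (not the package implementation), is that the denominator is recomputed cross-sectionally at the event date, so any event-induced variance inflates numerator and denominator alike:

```python
import math
import statistics

def bmp_z(sars):
    """BMP (1991) test: cross-sectionally standardize the SARs.

    sars: Patell-standardized abnormal returns per firm (toy data).
    """
    n = len(sars)
    mean_sar = sum(sars) / n
    s = statistics.stdev(sars)  # cross-sectional std dev of the SARs
    return math.sqrt(n) * mean_sar / s

# Five hypothetical firms' standardized abnormal returns
z = bmp_z([2.0, -0.5, 2.0, 1.5, 0.8])
```

For this toy sample z is about 2.47.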
| Pros | Cons |
|---|---|
| Robust to event-induced variance | Requires estimation-window variance estimates |
| Correctly sized even during volatile events | Slightly less powerful than Patell Z when variance is stable |
| Recommended by Kolari & Pynnonen (2010) | More complex to interpret |
When variance changes matter
Earnings announcements, M&A, and crisis events typically increase return variance at the event date. The BMP test prevents the inflated variance from producing false positives. Use it alongside Patell Z as a robustness check.
Kolari-Pynnonen Test
An adjusted version of the BMP test that corrects for both event-induced variance and cross-sectional correlation (event clustering). When multiple events occur on the same calendar date, standardized abnormal returns are correlated across firms, causing the BMP test to over-reject. The Kolari-Pynnonen adjustment accounts for this.
$$z_{KP} = z_{BMP}\,\sqrt{\frac{1-\bar{r}}{1+(N-1)\,\bar{r}}}$$

where $\bar{r}$ is the average cross-sectional correlation of the standardized abnormal returns across firms.
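The size of this correction is easy to underestimate. The Python sketch below (toy numbers, not the package implementation) applies the adjustment factor to a BMP statistic:

```python
import math

def kp_adjust(z_bmp, n, r_bar):
    """Kolari-Pynnonen correction: shrink the BMP statistic when
    standardized abnormal returns are cross-sectionally correlated.

    r_bar: average pairwise correlation of the SARs (hypothetical value).
    """
    return z_bmp * math.sqrt((1 - r_bar) / (1 + (n - 1) * r_bar))

# With 50 firms, even a modest average correlation of 0.05
# shrinks a BMP statistic of 2.5 to about 1.31
z = kp_adjust(2.5, 50, 0.05)
```

With $\bar{r} = 0$ the factor is 1 and the BMP statistic is unchanged; a seemingly significant z of 2.5 drops below conventional significance once modest clustering is accounted for, which is exactly the over-rejection the adjustment prevents.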
| Pros | Cons |
|---|---|
| Robust to event-induced variance and clustering | Requires estimation of cross-sectional correlation |
| Correctly sized for same-date events | Slightly less powerful than BMP when no clustering |
| Recommended for industry-wide event studies | Requires standardized residuals from estimation window |
When to use Kolari-Pynnonen
Use when events cluster on the same or nearby calendar dates (e.g., regulatory announcements, industry-wide shocks). It combines the strengths of the BMP test with a correction for cross-sectional dependence.
Calendar-Time Portfolio Test
When multiple events cluster on the same calendar dates, the cross-sectional independence assumption is violated. The Calendar-Time Portfolio test aggregates returns into a single time series of portfolio returns and tests using time-series standard errors.
$$t = \sqrt{T}\,\frac{\overline{AAR}}{s_{AAR}}$$

where $s_{AAR}$ is the time-series standard deviation of the portfolio AAR over the $T$ calendar dates.
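Because the cross-section is collapsed into one series before the standard error is computed, the implementation is a plain time-series t-test. A Python sketch with hypothetical per-date portfolio AARs (not the package implementation):

```python
import math
import statistics

def calendar_time_t(portfolio_aars):
    """Calendar-time portfolio t-test.

    portfolio_aars: portfolio-average abnormal return per calendar date
    (toy data). Cross-sectional dependence is absorbed into the
    portfolio, so ordinary time-series standard errors apply.
    """
    t_obs = len(portfolio_aars)
    mean_aar = sum(portfolio_aars) / t_obs
    s = statistics.stdev(portfolio_aars)  # time-series std dev
    return math.sqrt(t_obs) * mean_aar / s

# Six hypothetical calendar dates with clustered events
t = calendar_time_t([0.01, 0.02, -0.005, 0.015, 0.0, 0.01])
```

With only six calendar dates the test has few effective observations, illustrating the power loss noted in the table; here t is about 2.19.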
| Pros | Cons |
|---|---|
| Handles temporal clustering of events | Less powerful with few calendar dates |
| Does not require cross-sectional independence | Sensitive to portfolio weighting |
| Better for industry-wide events | Uncommon in standard event studies |
Non-Parametric Tests
Non-parametric tests do not require normally distributed returns. According to Corrado and Zivney (1992), non-parametric tests can improve rejection accuracy by 15–30% relative to parametric alternatives when daily returns exhibit skewness or excess kurtosis. Use them when diagnostics show non-normal residuals (Shapiro-Wilk p < 0.05) or when your sample contains extreme outliers.
Sign Test
Tests whether the proportion of positive abnormal returns exceeds 50%. Under the null hypothesis of zero abnormal returns, positive and negative AR are equally likely.
$$z = \frac{N^{+} - N/2}{\sqrt{N}/2}$$

where $N^{+}$ is the number of firms with positive AR at the event date.
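Under the null, $N^{+}$ is Binomial$(N, 0.5)$, and the statistic above is its normal approximation. A Python sketch with a hypothetical sample (not the package implementation):

```python
import math

def sign_z(ars):
    """Sign test: counts positive abnormal returns and compares the
    count with its null expectation N/2 via a normal approximation."""
    n = len(ars)
    n_pos = sum(1 for ar in ars if ar > 0)
    return (n_pos - n / 2) / (math.sqrt(n) / 2)

# 14 of 20 hypothetical firms react positively on the event day
z = sign_z([0.01] * 14 + [-0.01] * 6)
```

Only the signs enter; the magnitudes 0.01 are irrelevant, which is why the test ignores effect size. Here z is about 1.79.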
| Pros | Cons |
|---|---|
| No distributional assumptions | Low power (ignores magnitude) |
| Robust to outliers | Assumes symmetric return distribution |
| Simple to interpret | Cannot detect effects driven by magnitude |
Generalized Sign Test (Cowan 1992)
Improves on the Sign test by estimating the expected proportion of positive returns from the estimation window, rather than assuming 50%.
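The modification amounts to replacing the null proportion 0.5 with an estimate $\hat{p}$ from the estimation window. A Python sketch with toy data (the function and inputs are illustrative, not the package implementation):

```python
import math

def generalized_sign_z(event_ars, est_ars_by_firm):
    """Generalized sign test (Cowan 1992): estimate the expected share
    of positive ARs from each firm's estimation window instead of
    assuming 0.5.

    event_ars: event-day abnormal return per firm (toy data)
    est_ars_by_firm: estimation-window AR series, one per firm
    """
    n = len(event_ars)
    # p_hat: average fraction of positive estimation-window ARs
    p_hat = sum(
        sum(1 for ar in ars if ar > 0) / len(ars) for ars in est_ars_by_firm
    ) / n
    n_pos = sum(1 for ar in event_ars if ar > 0)
    return (n_pos - n * p_hat) / math.sqrt(n * p_hat * (1 - p_hat))

# Four hypothetical firms; estimation windows with 5, 4, 6 and 5
# positive days out of 10, so p_hat works out to 0.5
est = [
    [0.01] * 5 + [-0.01] * 5,
    [0.01] * 4 + [-0.01] * 6,
    [0.01] * 6 + [-0.01] * 4,
    [0.01] * 5 + [-0.01] * 5,
]
z = generalized_sign_z([0.01, 0.02, 0.03, -0.01], est)
```

When the estimation-window proportion happens to be 0.5, as in this toy example, the statistic coincides with the simple Sign test; for skewed return distributions the two diverge.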
| Pros | Cons |
|---|---|
| Accounts for return asymmetry | Still ignores magnitude |
| Better sized than simple Sign test | Requires estimation-window data |
| Robust to skewed distributions | Slightly more complex |
Rank Test (Corrado 1989)
Ranks all abnormal returns (estimation window + event window) and tests whether event-window ranks are systematically higher or lower than expected.
$$t_{rank} = \frac{\bar{K}_t}{s_{\bar{K}}}, \qquad \bar{K}_t = \frac{1}{N}\sum_{i=1}^{N}\left(K_{i,t} - \frac{L+1}{2}\right)$$

where $\bar{K}_t$ is the mean centered rank at event time $t$, $L$ is the combined window length, and $s_{\bar{K}}$ is the time-series standard deviation of the mean centered rank.
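The following Python sketch walks through the ranking and standardization steps with two hypothetical firms (illustrative only, not the package implementation; ties are ignored for simplicity):

```python
import math

def corrado_rank_t(ar_series_by_firm, event_index):
    """Corrado (1989) rank test sketch.

    ar_series_by_firm: one combined estimation + event window AR series
    per firm, all of equal length L (toy data).
    event_index: position of the event day within each series.
    """
    L = len(ar_series_by_firm[0])
    n = len(ar_series_by_firm)
    expected = (L + 1) / 2  # expected rank under the null
    # ranks[i][t]: rank of AR_{i,t} within firm i's own series (1 = smallest)
    ranks = []
    for series in ar_series_by_firm:
        order = sorted(range(L), key=lambda idx: series[idx])
        k = [0] * L
        for r, idx in enumerate(order, start=1):
            k[idx] = r
        ranks.append(k)
    # mean centered rank per period, then its time-series std dev
    mean_dev = [
        sum(ranks[i][t] - expected for i in range(n)) / n for t in range(L)
    ]
    s_k = math.sqrt(sum(d * d for d in mean_dev) / L)
    return mean_dev[event_index] / s_k

# Two hypothetical firms, 4 estimation days + 1 event day (index 4)
firm1 = [0.0, 0.01, -0.01, 0.005, 0.02]
firm2 = [0.01, -0.02, 0.0, -0.005, 0.03]
t_rank = corrado_rank_t([firm1, firm2], event_index=4)
```

Both toy firms have their largest AR on the event day, so the mean event-day rank sits well above its expectation of 3 and the statistic is about 1.91.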
| Pros | Cons |
|---|---|
| Uses magnitude information (via ranks) | Less intuitive than t-tests |
| Robust to non-normality and outliers | Requires combining estimation + event data |
| Well-powered across distributions | Sensitive to ties in rankings |
| Handles event-induced variance | |
Best non-parametric choice
The Rank test is generally the most powerful non-parametric test because it uses magnitude information (through ranks) while remaining robust to non-normality. Use it as your primary non-parametric robustness check.
Which Tests Should I Report?
- **AAR (Average Abnormal Return):** the cross-sectional mean of abnormal returns at a given event time, measuring the average market reaction across all firms in the sample.
- **CAAR (Cumulative Average Abnormal Return):** the sum of AARs over an event window, capturing the total average price impact of the event across the sample period.
- **SAR (Standardized Abnormal Return):** an abnormal return divided by its estimation-window standard deviation, used in the Patell Z and BMP tests to weight each firm inversely by its volatility.
Most published event studies report 2–3 tests. A robust reporting strategy:
| Scenario | Recommended Tests |
|---|---|
| Standard event study | CSect T + Patell Z + one non-parametric (Rank or Sign) |
| Event-induced variance likely | BMP + Rank test |
| Non-normal residuals | Sign + Generalized Sign + Rank |
| Clustered event dates | Kolari-Pynnonen + Calendar-Time Portfolio |
| Event-induced variance + clustering | Kolari-Pynnonen + Rank test |
| Maximum robustness | CSect T + Patell Z + BMP + Kolari-Pynnonen + Rank |
Literature
- Patell, J.M. (1976). Corporate forecasts of earnings per share and stock price behavior. Journal of Accounting Research, 14(2), 246–276.
- Boehmer, E., Musumeci, J. & Poulsen, A.B. (1991). Event-study methodology under conditions of event-induced variance. Journal of Financial Economics, 30(2), 253–272.
- Cowan, A.R. (1992). Nonparametric event study tests. Review of Quantitative Finance and Accounting, 2(4), 343–358.
- Corrado, C.J. (1989). A nonparametric test for abnormal security-price performance in event studies. Journal of Financial Economics, 23(2), 385–395.
- Corrado, C.J. & Zivney, T.L. (1992). The specification and power of the sign test in event study hypothesis tests using daily stock returns. Journal of Financial and Quantitative Analysis, 27(3), 465–478.
- Campbell, J.Y., Lo, A.W. & MacKinlay, A.C. (1997). The Econometrics of Financial Markets. Princeton University Press.
- Kolari, J.W. & Pynnonen, S. (2010). Event study testing with cross-sectional correlation of abnormal returns. Review of Financial Studies, 23(11), 3996–4025.
Run this in R
The EventStudy R package lets you run these calculations programmatically with full control over parameters.
What Should I Read Next?
- AR & CAR Test Statistics — single-event significance tests
- Test Statistics Overview — choosing the right test
- Assumptions — when tests break down
- Diagnostics & Export — validate model fit