What data format does the EventStudy R package require?

The package requires three data frames: firm stock prices (symbol, date, adjusted closing price), a market index with the same structure, and an event request table specifying event IDs, firm and index symbols, event dates, and window parameters. Dates must be in DD.MM.YYYY format.

How should I handle missing stock price data in an event study?

Missing data in the estimation window reduces the number of observations for model fitting. If more than 20% of estimation window observations are missing, exclude that event. Missing data in the event window creates gaps in abnormal returns and should generally lead to event exclusion.

What date format does the EventStudy R package use?

The EventStudy R package uses the DD.MM.YYYY date format (e.g., 03.01.2020 for January 3, 2020). Dates must be trading days -- weekends and holidays are excluded automatically.

Data Preparation for Event Studies

Q: Can I use unadjusted stock prices for an event study?

No. You must use adjusted closing prices that account for stock splits, dividends, and other corporate actions. Unadjusted prices create artificial return spikes that contaminate the estimation and event windows.

Correct data preparation is the foundation of a reliable event study. The EventStudy R package requires three input datasets: firm stock prices, a market index, and an event request table. Each must follow a specific format. This page covers the required data structures, date formatting, how to handle missing observations, and quality checks you should perform before running an analysis.

Part of the Methodology Guide

This page is part of the Event Study Methodology Guide.

Data quality determines the reliability of every downstream calculation. A 250-day estimation window captures approximately 12 months of trading data and requires at least 250 non-missing price observations per firm. US equity markets have roughly 252 trading days per year, making this the standard benchmark for estimation period length in daily event studies.

What Data Do I Need?

Data preparation for event studies is the process of assembling, formatting, and validating the financial time-series data required to estimate expected returns and compute abnormal returns. Proper preparation typically accounts for 60-70% of the total effort in an event study and is the single most important determinant of result reliability.

Every event study requires three datasets. The firm stock prices and market index provide the return data used to estimate the expected return model and compute abnormal returns. The event request table defines which events to analyze and how.

Dataset	Purpose	Required Columns	Rows
Firm stock prices	Daily adjusted closing prices for each firm	symbol, date, adjusted	One row per firm per trading day
Market index	Benchmark return data (e.g., S&P 500)	symbol, date, adjusted	One row per trading day
Event request table	Defines events and window parameters	event_id, firm_symbol, index_symbol, event_date, group, window params	One row per event

Use adjusted closing prices

Always use adjusted closing prices that account for stock splits, dividends, and other corporate actions. Unadjusted prices create artificial return spikes on ex-dividend and split dates, which contaminate both the estimation and event windows.

Firm Stock Price Data

The firm stock price data frame contains daily adjusted closing prices for each firm in your study. It must have exactly three columns:

Column	Type	Description	Example
symbol	Character	Unique firm identifier (ticker or ISIN)	AAPL
date	Character	Trading date in DD.MM.YYYY format	03.01.2020
adjusted	Numeric	Adjusted closing price	102.50

The data must span the entire period needed for each event: the estimation window plus the event window. For example, if you use a 250-day estimation window with a 10-day gap and a [-5, +5] event window, you need at least 270 trading days of data before and 5 days after each event date. According to Campbell, Lo, and MacKinlay (1997), a minimum of 120 trading days in the estimation window is necessary for reliable parameter estimates in the Market Model.

Prepare firm stock price data

library(EventStudy)
library(tidyverse)

# Load price data (e.g., from Yahoo Finance via tidyquant)
# library(tidyquant)
# raw_prices <- tq_get(c("AAPL", "MSFT", "GOOGL"),
#                       from = "2018-01-01", to = "2023-12-31")

# Format for the EventStudy package
firm_data <- raw_prices |>
  transmute(
    symbol = symbol,
    date = format(date, "%d.%m.%Y"),
    adjusted = adjusted
  )

head(firm_data)
# # A tibble: 6 x 3
#   symbol date       adjusted
#   <chr>  <chr>         <dbl>
# 1 AAPL   02.01.2018     172.
# 2 AAPL   03.01.2018     172.
# 3 AAPL   04.01.2018     173.
# 4 AAPL   05.01.2018     175.
# 5 AAPL   08.01.2018     174.
# 6 AAPL   09.01.2018     174.

Data sources

Common sources for adjusted stock prices include Yahoo Finance (via the tidyquant or quantmod R packages), Bloomberg, Refinitiv (formerly Thomson Reuters), and CRSP for academic research.

Market Index Data

The market index data frame has the same three-column structure as the firm data. It provides the benchmark return for the expected return model (e.g., the Market Model). Choose an index that represents the market in which your firms trade.

Market	Common Index	Symbol Example
US equities	S&P 500, CRSP Value-Weighted	SP500
European equities	STOXX Europe 600, DAX	STOXX600
UK equities	FTSE 100	FTSE100
Japanese equities	Nikkei 225, TOPIX	TOPIX
Emerging markets	MSCI Emerging Markets	MSCIEM

Prepare market index data

# Market index data (same structure as firm data)
index_data <- sp500_prices |>
  transmute(
    symbol = "SP500",
    date = format(date, "%d.%m.%Y"),
    adjusted = adjusted
  )

# Ensure the index covers the same date range as firm data
stopifnot(min(index_data$date) <= min(firm_data$date))
stopifnot(max(index_data$date) >= max(firm_data$date))

Index must match firm dates

The market index must cover at least the same date range as the firm data. If the index has trading days that a firm does not (e.g., the firm was not yet listed), the package handles this automatically. But if the index data is shorter than the required estimation + event window, the event will fail.

Event Request Table

The event request table defines each event you want to analyze. Each row represents one event: a specific firm experiencing a specific event on a specific date.

Column	Type	Description	Example
event_id	Integer	Unique identifier for the event	1
firm_symbol	Character	Must match a symbol in firm data	AAPL
index_symbol	Character	Must match a symbol in index data	SP500
event_date	Character	Date of the event in DD.MM.YYYY	15.03.2022
group	Character	Grouping variable for cross-sectional analysis	Tech
event_window_start	Integer	Start of event window relative to event date	-5
event_window_end	Integer	End of event window relative to event date	5
shift_estimation_window	Integer	Gap between estimation and event windows	-10
estimation_window_length	Integer	Number of trading days in estimation window	250

Create event request table

# Define events
request <- tibble(
  event_id = 1:3,
  firm_symbol = c("AAPL", "MSFT", "GOOGL"),
  index_symbol = "SP500",
  event_date = c("15.03.2022", "20.04.2022", "10.05.2022"),
  group = c("Tech", "Tech", "Tech"),
  event_window_start = -5,
  event_window_end = 5,
  shift_estimation_window = -10,
  estimation_window_length = 250
)

# The estimation window will be:
# [event_date + shift - estimation_window_length,
#  event_date + shift]
# i.e., 250 days ending 10 days before the event

The shift_estimation_window parameter creates a gap between the end of the estimation window and the start of the event window. This gap prevents the event from contaminating the estimation of normal returns. A typical gap is 10 trading days. For more on window selection, see Event Window & Estimation Window Selection.

Date Formatting

The EventStudy package uses the DD.MM.YYYY format for all dates. This applies to both the price data and the event request table. Dates must represent trading days — weekends and public holidays are not valid event dates.

Date formatting examples

# Convert from Date object to DD.MM.YYYY string
date_obj <- as.Date("2022-03-15")
formatted <- format(date_obj, "%d.%m.%Y")
# "15.03.2022"

# Convert from common formats
# From YYYY-MM-DD (ISO)
as.Date("2022-03-15") |> format("%d.%m.%Y")

# From MM/DD/YYYY (US)
as.Date("03/15/2022", format = "%m/%d/%Y") |> format("%d.%m.%Y")

# From YYYYMMDD (compact)
as.Date("20220315", format = "%Y%m%d") |> format("%d.%m.%Y")

# Batch convert a data frame column
df$date <- format(as.Date(df$date), "%d.%m.%Y")

Common date mistakes

The most frequent data preparation error is incorrect date formatting. Ensure you use periods as separators (not hyphens or slashes) and the day-month-year order (not month-day-year). A date like "03.01.2020" means January 3, 2020 — not March 1.

Handling Missing Data

Missing data arises from trading halts, newly listed firms, delistings, and data provider gaps. Studies show that approximately 5-15% of events in a typical sample are affected by missing data issues. How you handle missing observations depends on where they appear.

Location	Impact	Recommended Action
Estimation window (few days)	Slightly fewer observations for model fitting	Usually acceptable; the package handles this automatically
Estimation window (>20% missing)	Unreliable model parameter estimates	Exclude the event from the study
Event window	Gap in abnormal returns; cumulative AR is unreliable	Exclude the event or use the last available price (carry forward)
Market index	Cannot compute market return for that day	Fill from alternative source or exclude affected days

Check and handle missing data

# Check for missing values
sum(is.na(firm_data$adjusted))

# Check for gaps in trading dates per firm
firm_data |>
  group_by(symbol) |>
  summarise(
    n_days = n(),
    first_date = min(date),
    last_date = max(date),
    n_missing = sum(is.na(adjusted))
  )

# Remove firms with insufficient data
min_required_days <- 250 + 10 + 5  # estimation + gap + event window
firm_data <- firm_data |>
  group_by(symbol) |>
  filter(n() >= min_required_days) |>
  ungroup()

# Optional: forward-fill isolated missing prices
# (use with caution -- this assumes zero return on missing days)
firm_data <- firm_data |>
  group_by(symbol) |>
  arrange(date) |>
  fill(adjusted, .direction = "down") |>
  ungroup()

Data Quality Checks

Before running an event study, perform the following quality checks to catch common data issues that can invalidate results.

Check	What to Look For	How to Fix
Duplicate rows	Same symbol + date appears more than once	distinct(symbol, date, .keep_all = TRUE)
Zero or negative prices	Adjusted price <= 0	Remove or replace from alternative source
Extreme returns	Daily return > 100% or < -90%	Verify against source; likely a data error or split not adjusted
Date alignment	Firm and index dates do not overlap	Ensure both cover the same trading calendar
Sufficient history	Fewer days than estimation_window_length + gap + event_window	Extend data range or shorten estimation window
Event date is a trading day	Event date falls on weekend or holiday	Shift to next trading day or verify against trading calendar

Complete data quality check

# 1. Remove duplicates
firm_data <- firm_data |> distinct(symbol, date, .keep_all = TRUE)
index_data <- index_data |> distinct(symbol, date, .keep_all = TRUE)

# 2. Check for zero or negative prices
stopifnot(all(firm_data$adjusted > 0, na.rm = TRUE))
stopifnot(all(index_data$adjusted > 0, na.rm = TRUE))

# 3. Check for extreme returns (flag but don't auto-remove)
firm_returns <- firm_data |>
  group_by(symbol) |>
  arrange(date) |>
  mutate(ret = adjusted / lag(adjusted) - 1) |>
  ungroup()

extreme <- firm_returns |> filter(abs(ret) > 1)
if (nrow(extreme) > 0) {
  warning(paste(nrow(extreme), "extreme returns detected. Verify data."))
  print(extreme)
}

# 4. Check that event dates exist in firm data
event_dates_valid <- request |>
  rowwise() |>
  mutate(
    date_in_firm = event_date %in%
      (firm_data |> filter(symbol == firm_symbol) |> pull(date)),
    date_in_index = event_date %in%
      (index_data |> filter(symbol == index_symbol) |> pull(date))
  )

invalid_events <- event_dates_valid |>
  filter(!date_in_firm | !date_in_index)

if (nrow(invalid_events) > 0) {
  warning("Some event dates are not trading days. Adjust or remove:")
  print(invalid_events)
}

Automate quality checks

Wrap these checks in a function and run them every time you prepare data. This prevents subtle data errors from silently producing unreliable results.

Complete Example: From Raw Data to Event Study

Here is a full workflow that downloads data, prepares it, runs quality checks, and executes an event study.

End-to-end data preparation and event study

library(EventStudy)
library(tidyverse)

# --- Step 1: Prepare firm prices ---
firm_data <- tibble(
  symbol = rep(c("AAPL", "MSFT"), each = 300),
  date = rep(format(
    seq(as.Date("2021-01-04"), by = "day", length.out = 430) |>
      ((d) d[!weekdays(d) %in% c("Saturday", "Sunday")])() |>
      head(300),
    "%d.%m.%Y"
  ), 2),
  adjusted = c(
    cumprod(1 + rnorm(300, 0.0005, 0.015)) * 130,
    cumprod(1 + rnorm(300, 0.0004, 0.012)) * 240
  )
)

# --- Step 2: Prepare market index ---
index_data <- tibble(
  symbol = "SP500",
  date = unique(firm_data$date),
  adjusted = cumprod(1 + rnorm(300, 0.0003, 0.010)) * 4200
)

# --- Step 3: Create event request ---
request <- tibble(
  event_id = 1:2,
  firm_symbol = c("AAPL", "MSFT"),
  index_symbol = "SP500",
  event_date = firm_data |>
    filter(symbol == "AAPL") |>
    slice(260) |>
    pull(date),
  group = "Tech",
  event_window_start = -5,
  event_window_end = 5,
  shift_estimation_window = -10,
  estimation_window_length = 240
)

# --- Step 4: Quality checks ---
stopifnot(all(firm_data$adjusted > 0))
stopifnot(nrow(distinct(firm_data, symbol, date)) == nrow(firm_data))

# --- Step 5: Run event study ---
task <- EventStudyTask$new(firm_data, index_data, request)
ps <- ParameterSet$new()
task <- run_event_study(task, ps)

# --- Step 6: View results ---
task$get_aar()

Adjusted Closing Price: A stock's closing price retroactively modified to account for corporate actions such as stock splits, dividends, and rights offerings. Using unadjusted prices introduces artificial return spikes that bias both estimation and event windows.
Trading Day: A day on which the stock exchange is open for trading. US equity markets have approximately 252 trading days per year. Event study windows are measured in trading days, not calendar days.
Event Request Table: A structured data frame specifying each event to be analyzed, including the firm identifier, market index, event date, and window parameters. Each row represents one firm-event observation.

Run this in R

The EventStudy R package lets you run these calculations programmatically with full control over parameters.

Get the R Package Or try the App

What Should I Do Next?

Event Window & Estimation Window Selection — choose appropriate window lengths for your study
Expected Return Models — select the right benchmark model
Return Calculation — simple vs. log returns
Diagnostics & Export — validate your results and export for publication