Research Guides: Time-Series Analysis in R: Event study

Time-Series Analysis Basics

Converting into date variables

Dates can be written in several different formats, and each one must be converted into a date variable so that R recognizes it as a date.

We first read the file into R.

mydata<-read.csv("https://dss.princeton.edu/training/mydata_date.csv")

Source: https://www.statmethods.net/input/dates.html

We can see that the four columns represent the same dates in different formats. The first two columns are character strings written in different ways, and the last two are integers.

Now we will show how to convert each of these four variables into a date variable.

converting the string 'date1' into a date variable called 'new.date1':

mydata$new.date1 = as.Date(mydata$date1, "%d-%b-%y")

converting the string 'date2' into a date variable called 'new.date2':

mydata$new.date2 = as.Date(mydata$date2, "%m/%d/%Y")

converting the integer 'date3' into a date variable called 'new.date3':

mydata$new.date3 = as.Date(as.character(mydata$date3), "%Y%m%d")
#note that in this case we have to convert the integer into character first

converting the integer 'date4' (written without zero padding) into a date variable called 'new.date4':

mydata$date4 = as.character(mydata$date4) # Need to convert to character
mydata$len.date4 = nchar(mydata$date4) # Need to identify the length first
# 6 digits (YYYYMD): pad both the month and the day with a zero.
# Otherwise (assumed to be 7 digits, YYYYMDD): pad the month only.
# Note that this treats the month as a single digit in both cases.
mydata$date4b = ifelse(mydata$len.date4==6, paste0(substr(mydata$date4,1,4),0,
                                                   substr(mydata$date4,5,5),0,
                                                   substr(mydata$date4,6,6)),
                       paste0(substr(mydata$date4,1,4),0,
                              substr(mydata$date4,5,5),
                              substr(mydata$date4,6,7)))
mydata$new.date4 = as.Date(mydata$date4b, "%Y%m%d")

Looking at the conversion results, we can see that new.date1 through new.date4 are now date variables.
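One way to confirm a conversion is to inspect the class of each column. The sketch below uses a small made-up data frame (the column names and values are illustrative, not the contents of mydata_date.csv) and numeric month formats only, so that it does not depend on the locale.

```r
# Toy data frame; the values are made up for illustration
toy <- data.frame(d_chr = c("01/15/2011", "12/31/2012"),
                  d_int = c(20110115L, 20121231L),
                  stringsAsFactors = FALSE)

toy$new_chr <- as.Date(toy$d_chr, "%m/%d/%Y")
toy$new_int <- as.Date(as.character(toy$d_int), "%Y%m%d")

# Class of every column: the two new columns should report "Date"
sapply(toy, class)

# A string that does not match the supplied format becomes NA
bad <- as.Date("not a date", "%m/%d/%Y")
bad
```

Checking for unexpected NA values after a conversion is a quick way to catch a format string that does not match the data.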

Note that it's easy to convert them back into strings:

mydata$string.date1 = as.character(mydata$new.date1)
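If a layout other than the default is needed, format() accepts the same format codes used by as.Date(). A minimal sketch with a single illustrative date:

```r
d <- as.Date("2011-01-15")

s_default <- as.character(d)        # "2011-01-15"
s_dmy     <- format(d, "%d/%m/%Y")  # "15/01/2011"
s_compact <- format(d, "%Y%m%d")    # "20110115"
```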

Extracting year, month and day using base functions

mydata$date<-mydata$new.date1
# Extracting year (from a variable in date format)
mydata$year = as.numeric(format(mydata$date, "%Y"))
# Extracting month (from a variable in date format)
mydata$month = as.numeric(format(mydata$date, "%m"))
# Extracting day (from a variable in date format)
mydata$day = as.numeric(format(mydata$date, "%d"))
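The same components can also be read from a POSIXlt object, which stores them as separate fields. This is an alternative sketch using two illustrative dates. Note the offsets: POSIXlt counts years from 1900 and months from 0.

```r
d  <- as.Date(c("2011-01-15", "2012-12-31"))
lt <- as.POSIXlt(d)

year  <- lt$year + 1900   # years are stored as years since 1900
month <- lt$mon + 1       # months are stored as 0-11
day   <- lt$mday          # day of the month, 1-31

# The results agree with the format() approach
data.frame(d, year, month, day)
```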

Lags and forwards (leads)

# Getting the sample data
usa = read.csv("http://dss.princeton.edu/training/us.csv", header=TRUE)
# Lag 1 of 'gdppcgr', see variable 'l1.gdp' below
usa$l1.gdp <- c(NA,usa$gdppcgr[1:(nrow(usa)-1)])
# Forward 1 of 'gdppcgr', see variable 'f1.gdp' below
usa$f1.gdp <- c(usa$gdppcgr[2:nrow(usa)],NA)
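To see what the shift does, it can help to apply it to a short vector. The sketch below uses made-up values and also shows equivalent forms with head() and tail(), plus a note on operator precedence in index expressions.

```r
x <- c(10, 20, 30, 40, 50)
n <- length(x)

x_lag  <- c(NA, x[1:(n - 1)])   # NA 10 20 30 40
x_lead <- c(x[2:n], NA)         # 20 30 40 50 NA

# Equivalent forms using head() and tail()
x_lag_alt  <- c(NA, head(x, -1))
x_lead_alt <- c(tail(x, -1), NA)

# Precedence: ':' binds more tightly than '-', so 1:n - 1 means (1:n) - 1
idx_a <- 1:n - 1      # 0 1 2 3 4
idx_b <- 1:(n - 1)    # 1 2 3 4
```

Indexing with idx_a happens to give the same elements as idx_b, because R silently drops a zero index, but writing the parentheses makes the intent explicit.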

Lag and forward variables in panel data

# Creating a dataset
set.seed(12345)
mydata = data.frame(country = rep(toupper(letters[1:3]), each=5),
                    year = rep(2000:2004,3),
                    var1 = rnorm(15))
# Function to get the lags (named lag1 so that it does not mask lag())
lag1 = function(x) c(NA,x[1:(length(x)-1)])
# Getting the lags in the data
mydata$lag.var1 = ave(mydata$var1, mydata$country, FUN=lag1)

# or using plm package to get the lag:
library(plm)
mydata = pdata.frame(mydata, index = c("country", "year"))
mydata$lag.var1 = lag(mydata$var1) # uses plm's panel-aware lag method

# Function to get the forward or lead values (named lead1 so that it does not mask plm's lead())
lead1 = function(x) c(x[2:length(x)],NA)
# Getting the forward/leads in the data
mydata$lead.var1 = ave(mydata$var1, mydata$country, FUN=lead1)
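The reason for computing lags within each country is that a plain shift would carry the last value of one country into the first year of the next. A small base R sketch with made-up values shows the difference (the names panel and shift_back are hypothetical and used only for this illustration):

```r
# Two units, three years each
panel <- data.frame(id   = rep(c("A", "B"), each = 3),
                    year = rep(2000:2002, 2),
                    v    = c(1, 2, 3, 10, 20, 30))
panel <- panel[order(panel$id, panel$year), ]

shift_back <- function(x) c(NA, head(x, -1))

# Lag computed separately within each unit
panel$lag_v <- ave(panel$v, panel$id, FUN = shift_back)

# Naive lag that ignores the grouping, for comparison
panel$naive <- shift_back(panel$v)
panel
```

In row 4 (the first year of unit B) the grouped lag is NA, while the naive lag wrongly picks up the last value of unit A.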

Replacing missing values with previous non-missing

We often have to deal with datasets that have missing data. Whatever the reason the data are missing on certain dates, one way of filling in an NA value is to use the most recent previous non-missing value.

# Creating a dataset
set.seed(12345)
mydata = data.frame(country = rep(toupper(letters[1:3]), each=5),
                    year = rep(2000:2004,3),
                    var1 = rnorm(15))
mydata$var1 = ifelse(mydata$year<2003,mydata$var1,NA)
# Replacing missing values with previous non-missing
library(zoo)
mydata$var2 <- na.locf(mydata$var1)
mydata
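In the example above every country starts with a non-missing value, so filling the whole column at once works. In a panel where a unit begins with a gap, an ungrouped fill would borrow the value from the previous unit. The sketch below, using made-up values and the hypothetical names panel, fill_all and fill_grp, fills within each unit instead and sets na.rm = FALSE so that leading NAs are kept rather than dropped.

```r
library(zoo)

panel <- data.frame(id = rep(c("A", "B"), each = 4),
                    v  = c(1, NA, 3, NA,     # unit A
                           NA, 5, NA, NA))   # unit B starts with a gap

# Ungrouped fill: the leading NA of unit B takes the last value of unit A
panel$fill_all <- na.locf(panel$v, na.rm = FALSE)

# Grouped fill: the leading NA of unit B stays NA
panel$fill_grp <- ave(panel$v, panel$id,
                      FUN = function(x) na.locf(x, na.rm = FALSE))
panel
```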

Rolling sum in panel data

# Creating a dataset
set.seed(12345)
mydata = data.frame(country = rep(toupper(letters[1:3]), each=5),
                    year = rep(2000:2004,3),
                    var1 = rnorm(15))
# Sort data by country and year
mydata = mydata[ order(mydata$country, mydata$year), ]
# Rolling sum every four years
library(zoo)
rolsum = function(x) rollapply(x, 4, sum, na.rm=TRUE, fill = NA, align = "right")
mydata$sum = ave(mydata$var1, mydata$country, FUN=rolsum)
mydata
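A trailing window of width 4 means each value is the sum of the current observation and the three before it, so the first three positions have no complete window. As a cross-check, the same arithmetic can be sketched in base R with stats::filter() on a short made-up vector (this is a comparison, not what rollapply() does internally, and it does not skip NAs the way na.rm=TRUE does):

```r
x <- c(1, 2, 3, 4, 5)

# Trailing 4-period sum; sides = 1 uses current and past values only
roll4 <- as.numeric(stats::filter(x, rep(1, 4), sides = 1))
roll4   # NA NA NA 10 14
```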

Event Study Example

We can use the 'eventstudies' package in R to conduct an event study. Note that the package cannot be installed directly with install.packages(). You have to install the 'githubinstall' package first and then use it to install eventstudies from GitHub.

install.packages('digest')
install.packages('githubinstall')
library(githubinstall)
githubinstall('eventstudies',force=TRUE)
library(eventstudies)

After that, we load the example data.

data('StockPriceReturns')
data('SplitDates')
data('OtherReturns')

The overview of the data:

The 'StockPriceReturns' dataset contains the stock returns of several companies by date, 'OtherReturns' contains benchmark series such as the market index (NiftyIndex) and the exchange rate (USDINR) used below, and 'SplitDates' gives the date on which the event (a stock split) happened.

Now we can run the event study.

es <-eventstudy(firm.returns = StockPriceReturns,
                event.list = SplitDates,
                event.window = 7,
                type = 'None', #We are using 'None' as the type here.
                to.remap = TRUE,
                remap = 'cumsum',
                inference = TRUE,
                inference.strategy = 'bootstrap')
plot(es)

The plot shows a price drop at the time of the event. About three days after the event, the stock price appears to start recovering.

Now, we use the market model.

es1 <-eventstudy(firm.returns = StockPriceReturns,
                event.list = SplitDates,
                event.window = 7,
                type = 'marketModel', #Here we use the market model
                to.remap = TRUE,
                remap = 'cumsum',
                inference = TRUE,
                inference.strategy = 'bootstrap',
                model.args = list(
                  market.returns = OtherReturns[,'NiftyIndex'] #Here we specify what benchmark we will use
                ))
plot(es1)

Now we can see that, under the market model, the split does not appear to influence the stock price very much.
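The idea behind a market-model adjustment can be sketched in base R on simulated data. This is a simplified illustration under stated assumptions, not the eventstudies implementation: the window lengths, the simulated returns, and all variable names below are made up for the example. The firm's return is regressed on the market return over an estimation window, and the abnormal return in the event window is the actual return minus the return predicted from the market.

```r
set.seed(1)
n_est <- 200   # length of the estimation window (assumed)
n_evt <- 15    # length of the event window (assumed)

# Simulated returns: the firm moves with the market plus noise
market <- rnorm(n_est + n_evt, mean = 0, sd = 0.01)
firm   <- 0.0002 + 1.2 * market + rnorm(n_est + n_evt, mean = 0, sd = 0.005)

est <- 1:n_est
evt <- (n_est + 1):(n_est + n_evt)

# Market model fitted on the estimation window only
est_data <- data.frame(firm = firm[est], market = market[est])
fit <- lm(firm ~ market, data = est_data)

# Abnormal return = actual return minus market-predicted return
predicted <- as.numeric(predict(fit, newdata = data.frame(market = market[evt])))
abnormal  <- firm[evt] - predicted

# Cumulative abnormal return over the event window
car <- cumsum(abnormal)
```

Because market-wide movements are removed, what remains in car is the part of the return that the benchmark does not explain, which is why the adjusted plot can look flatter than the unadjusted one.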

We can also use the augmented market model to allow for more flexibility.

es2 <-eventstudy(firm.returns = StockPriceReturns,
                 event.list = SplitDates,
                 event.window = 7,
                 type = 'lmAMM', #'lmAMM' selects the augmented market model
                 to.remap = TRUE,
                 remap = 'cumsum',
                 inference = TRUE,
                 inference.strategy = 'bootstrap',
                 model.args = list(
                   market.returns = OtherReturns[,'NiftyIndex'],
                   others = OtherReturns[,'USDINR'], #Argument names are case-sensitive; check ?eventstudy for your installed version
                   market.returns.purge=TRUE,
                   nlag.makeX = 5, #We can manually set the lag time
                   nlag.lmAMM = 5
                 ))
plot(es2)

This time, the plot closely resembles the one from the market model. Both indicate that the market 'recovered' from the split very quickly.

The eventstudies package also allows you to convert the returns into cumulative returns step by step.

es <- phys2eventtime(z=StockPriceReturns, events=SplitDates, width=10)
es.w <- window(es$z.e, start=-10, end=10)
es.cs <- remap.cumsum(es.w,is.pc=FALSE,base=0)
es.cs
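The arithmetic behind a cumulative-sum remap can be illustrated in base R. The sketch below is not a reimplementation of remap.cumsum(); it uses a small made-up matrix of returns in event time, with one column per event, and accumulates each column from a base of 0.

```r
# Rows are days in event time, columns are events (made-up returns)
ret <- matrix(c( 0.01, -0.02, 0.03,    # event1
                -0.01,  0.00, 0.02),   # event2
              nrow = 3,
              dimnames = list(c("-1", "0", "1"), c("event1", "event2")))

# Running total down each column
cum <- apply(ret, 2, cumsum)
cum
```

Each entry of cum is the total return accumulated from the start of the window up to that day, which is what makes a drop and a later recovery visible in a plot.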