Research Guides: Reshape in R: Long/Wide format

Introduction to wide and long data

A dataset can be written in two different formats: wide and long.

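In wide format, values do not repeat in the first column: each subject has one row, and each measured variable has its own column.

In long format, values do repeat in the first column: each subject has one row per measurement.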

Example of wide format:

Student	Math	Literature	PE
A	99	45	56
B	73	78	55
C	12	96	57

Example of long format:

Student	Subject	Score
A	Math	99
A	Literature	45
A	PE	56
B	Math	73
B	Literature	78
B	PE	55
C	Math	12
C	Literature	96
C	PE	57

We can see that in the wide format no value repeats in the first column: each student has a single row. In the long format, each student appears once per subject.
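As a minimal sketch, the two example tables above can be typed in as R data frames and the first column checked for repeats. The object names `wide` and `long` are chosen here for illustration only.

```r
# Wide format: one row per student, one column per subject.
wide <- data.frame(
  Student    = c("A", "B", "C"),
  Math       = c(99, 73, 12),
  Literature = c(45, 78, 96),
  PE         = c(56, 55, 57)
)

# Long format: one row per student-subject pair.
long <- data.frame(
  Student = rep(c("A", "B", "C"), each = 3),
  Subject = rep(c("Math", "Literature", "PE"), times = 3),
  Score   = c(99, 45, 56, 73, 78, 55, 12, 96, 57)
)

# anyDuplicated() returns 0 when no value repeats,
# otherwise the position of the first repeated value.
anyDuplicated(wide$Student)  # 0: no student repeats in the wide table
anyDuplicated(long$Student)  # 2: student "A" already repeats in row 2
```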

Datasets downloaded from a website are not always in the format that a statistical analysis requires. This guide therefore shows how to convert between the two formats in R.

Reshape wide to long

First we load the data.

rw <- read.csv("https://dss.princeton.edu/training/widetolong.csv")

This data is in the wide format.

Now we reshape this dataset.

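data1 <- reshape(data = rw,
                 idvar = "Country.Name",    # column that identifies each country
                 varying = 2:11,            # columns to be reshaped
                 sep = "",                  # split column names where letters end and digits begin
                 timevar = "year",          # name of the new time column
                 times = c(2017, 2018, 2019, 2020, 2021),
                 new.row.names = 1:10000,   # integer row names instead of the default id.time labels
                 direction = "long")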

The dataset is now in long format.
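To see what the call does without downloading anything, here is a small self-contained sketch. The data frame `toy_wide` and its column names (`GDP2017`, `POP2017`, and so on) are invented for illustration and are not the actual contents of widetolong.csv. With `sep = ""`, `reshape()` splits each name just before the first digit that follows a letter, so `GDP2017` becomes variable `GDP` at time `2017`.

```r
# Hypothetical wide data: two variables (GDP, POP) measured in two years.
toy_wide <- data.frame(
  Country.Name = c("X", "Y"),
  GDP2017 = c(1.0, 2.0),
  GDP2018 = c(1.1, 2.1),
  POP2017 = c(10, 20),
  POP2018 = c(11, 21)
)

toy_long <- reshape(data = toy_wide,
                    idvar = "Country.Name",
                    varying = 2:5,      # the four year-specific columns
                    sep = "",           # "GDP2017" is read as GDP at time 2017
                    timevar = "year",
                    direction = "long")

# One row per country-year pair, with one column per variable.
names(toy_long)  # "Country.Name" "year" "GDP" "POP"
nrow(toy_long)   # 4
```

In this sketch `times` is left out, so the years are taken from the column names. The call above passes `times` explicitly, which sets the values written to the `year` column.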

Reshape long to wide

Next, we load a dataset that is in long format.

rl <- read.csv("https://dss.princeton.edu/training/longtowide.csv")

# Data source: World Bank (WDI), 2007-2021

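rl.wide <- reshape(data = rl,
                   idvar = "year",        # one row per year in the result
                   v.names = "GDP",       # variable whose values are spread across columns
                   timevar = "country",   # one new GDP column per country
                   direction = "wide")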

The dataset has now been transformed into wide format, with one row per year and one GDP column per country.
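The same reshaping can be tried on a small self-contained example. The data frame `gdp_long` and its values are made up for illustration. Only the column names `year`, `country` and `GDP` follow the call above.

```r
# Hypothetical long data: one row per year-country pair.
gdp_long <- data.frame(
  year    = c(2020, 2020, 2021, 2021),
  country = c("X", "Y", "X", "Y"),
  GDP     = c(1.0, 2.0, 1.1, 2.1)
)

gdp_wide <- reshape(data = gdp_long,
                    idvar = "year",
                    v.names = "GDP",
                    timevar = "country",
                    direction = "wide")

# New columns are named <v.names>.<timevar value>.
names(gdp_wide)  # "year" "GDP.X" "GDP.Y"
nrow(gdp_wide)   # 2
```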