Data are usually available in either "long" or "wide" format. This tutorial shows how to convert long format data to wide format, and wide format data to long format, using Stata's reshape command.
Wide format data:
Wide-form data are organized by the group identifier (e.g., individual, state, country), and all observations on a particular identifier are usually stored in a single row.
Example
country gdp2020 gdp2021 trade2020 trade2021
A 80 85 21 23
B 35 40 10 11
C 75 78 23 27
Notice that in the above wide form data, observations for each identifier (i.e., country) are presented in a single row.
Long format data:
Long-form data are organized by the within-group identifier (e.g., year, month, date), storing the observations for each identifier in multiple rows.
Example
country year gdp trade
A 2020 80 21
A 2021 85 23
B 2020 35 10
B 2021 40 11
C 2020 75 23
C 2021 78 27
Notice that in the above long form data, observations for each identifier (i.e., country) are presented according to the within-group identifier (i.e., year) and stored in multiple rows.
Remember, for panel data analysis, we need data in long format.
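As a minimal sketch of why the long layout matters, the long-format example table above can be typed in by hand and declared as a panel. The variable name country_id is an assumption made for this illustration, and the data are the toy values from the table, not a real dataset.

```stata
* Toy data: the long-format example table shown above
clear
input str1 country year gdp trade
"A" 2020 80 21
"A" 2021 85 23
"B" 2020 35 10
"B" 2021 40 11
"C" 2020 75 23
"C" 2021 78 27
end

* isid exits with an error unless country and year jointly identify each row
isid country year

* xtset needs a numeric panel identifier, so encode the string variable
* (country_id is a name chosen for this sketch)
encode country, generate(country_id)
xtset country_id year
```

If isid stops with an error, the data contain duplicate country-year rows and should be checked before any panel command is run.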
Let's convert wide format data to long format.
Example 1
- Load the following dataset
use https://dss.princeton.edu/training/widelong-1.dta
Part of the loaded wide format dataset looks like this
- To convert the above wide form data to long form, type:
reshape long gdp trade, i(country) j(year)
Notes:
After reshape long, type the stub names of the variables to reshape. Here there are two: gdp and trade.
i() specifies the group identifier (e.g., individual, state, country, or any other entity).
j() specifies the within-group identifier (e.g., year, month, date). For reshape long, it is the new variable that receives the suffixes of the wide variable names.
- Stata will give you the following message to show how it has converted the data from wide to long format
- The converted long form dataset looks like this
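The same conversion can be tried without downloading anything. The sketch below assumes the small wide-format table from the start of this guide (countries A, B, C), entered by hand, rather than the Princeton file.

```stata
* Toy data: the wide-format example table from the start of this guide
clear
input str1 country gdp2020 gdp2021 trade2020 trade2021
"A" 80 85 21 23
"B" 35 40 10 11
"C" 75 78 23 27
end

* gdp and trade are the stubs; the suffixes 2020 and 2021 become values of year
reshape long gdp trade, i(country) j(year)

* one row per country-year: 3 countries x 2 years = 6 rows
list, sepby(country)
```

The result should match the long-format example table: each country appears once per year, with gdp and trade as ordinary columns.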
Example 2
- Load the following foreign aid dataset
use https://dss.princeton.edu/training/widelong-2.dta
The loaded wide format dataset looks like this
Notice that the variables in the dataset do not have substantive names (they are named A, B, C, ...).
First, let's assign a name for the "id". We'll do this by renaming column A as recipient. Type:
Second, let's assign a variable name by typing aid before each year:
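The two renaming steps can be sketched on a small stand-in dataset. Everything here is hypothetical: the recipients, the aid values, and the years 2000 to 2002 are made up for illustration, and the real file has its own columns and years.

```stata
* Hypothetical stand-in for the aid file: A holds the recipient,
* B, C, and D hold aid values for three made-up years
clear
input str10 A B C D
"Alpha" 10 11 12
"Beta"   5  6  7
end

* Step 1: give the identifier column a meaningful name
rename A recipient

* Step 2: rename a group of variables in one command
* (the years 2000-2002 are assumptions for this sketch)
rename (B C D) (aid2000 aid2001 aid2002)

describe
```

After both steps every year column shares the stub aid followed by a numeric suffix, which is the pattern reshape long expects.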
After renaming the variables, the dataset looks like this:
Now the dataset is ready for reshaping. Type the following command:
reshape long aid, i(recipient) j(year)
- Stata will give you the following message to show how it has converted the data from wide to long format
- The converted long form dataset looks like this
- If you run the summarize command, the summary statistics for aid are blank, because aid is stored as a string variable.
- To solve the issue, we need to convert aid to a numeric variable with destring. Type:
destring aid, replace ignore("..")
Notice that the resulting summary statistics now contain information about the aid variable.
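Because ignore() works on individual characters, it is worth checking afterwards that decimal points in genuine values were not affected. A cautious alternative, sketched here on hypothetical toy data, blanks out only the cells that are exactly ".." and then converts the variable without ignore(). The recipients and values are made up for illustration.

```stata
* Hypothetical long-format aid data with ".." marking missing values
clear
input str10 recipient year str6 aid
"Alpha" 2000 "12.5"
"Alpha" 2001 ".."
"Beta"  2000 "3"
"Beta"  2001 "7.25"
end

* blank out only the cells that are exactly the ".." placeholder
replace aid = "" if aid == ".."

* with no other nonnumeric text left, destring converts without ignore()
destring aid, replace

summarize aid
```

Empty strings become numeric missing values, so summarize reports statistics for the remaining observations only.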
Let's convert long format data to wide format.
Example 1
- Load the following dataset
use https://dss.princeton.edu/training/longwide-1.dta
The loaded long form dataset looks like this
- To convert the above long form data to wide form, type:
reshape wide gdp, i(country) j(year)
Notes:
- Stata will give you the following message to show how it has converted the data from long to wide format
The converted dataset looks like this
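As a self-contained sketch, the long-format example table from the start of this guide can be entered by hand and converted to wide. Note one assumption that differs from the command above: the toy table has two variables that change over time, so both gdp and trade are listed as stubs. Variables that vary within i() but are left out of the stub list cause reshape wide to stop with an error.

```stata
* Toy data: the long-format example table from the start of this guide
clear
input str1 country year gdp trade
"A" 2020 80 21
"A" 2021 85 23
"B" 2020 35 10
"B" 2021 40 11
"C" 2020 75 23
"C" 2021 78 27
end

* each value of year becomes a suffix: gdp2020 gdp2021 trade2020 trade2021
reshape wide gdp trade, i(country) j(year)

* one row per country
list
```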
Example 2
- Load the following dataset
use https://dss.princeton.edu/training/longwide-2.dta
The dataset looks like this
- To convert the dataset to wide format, we first need to change the date variable. Type the following Stata commands:
- The data will look like this
- To convert the above long form data to wide form, type:
reshape wide return interest, i(id) j(date) str
Notes:
- Stata will give you the following message to show how it has converted the data from long to wide format
- The converted wide form dataset looks like this
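The role of the string option (abbreviated str above) can be illustrated on toy data. This is only a sketch under stated assumptions: the ids, values, and the "2020-01" date layout are hypothetical, and replacing hyphens with underscores is one possible way to make date values usable as variable-name suffixes, not necessarily the change applied to the Princeton file.

```stata
* Hypothetical long-format data with a string date such as "2020-01"
clear
input id str7 date return interest
1 "2020-01" 5 2
1 "2020-02" 6 2.5
2 "2020-01" 4 1
2 "2020-02" 3 1.5
end

* hyphens are not allowed in variable names, so swap them for underscores
replace date = subinstr(date, "-", "_", .)

* string tells reshape that the j() variable holds text rather than numbers
reshape wide return interest, i(id) j(date) string

describe
```

Each distinct date value is appended to the stubs, giving return2020_01, return2020_02, interest2020_01, and interest2020_02.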
Notes:
- Type help reshape in Stata for more help.
- If you have a very large dataset, and you want to restructure it more efficiently, follow the instructions provided here.
DSS Data Analysis Guides. Available at: https://libguides.princeton.edu/c.php?g=1415215
Princeton DSS Libguides: https://libguides.princeton.edu/dss
Stata: https://www.stata.com/manuals13/dreshape.pdf
UCLA_1: https://stats.oarc.ucla.edu/stata/modules/reshaping-data-wide-to-long/
UCLA_2: https://stats.oarc.ucla.edu/stata/modules/reshaping-data-long-to-wide/
U Virginia: https://data.library.virginia.edu/stata-basics-reshape-data/
World Development Indicators (World Bank): https://databank.worldbank.org/source/world-development-indicators#
If you have questions or comments about this guide or method, please email data@Princeton.edu.