Health, Crime, and other Socioeconomic Data - Fall 2023

Health, Crime, and other Socio-Economic Microdata Examples

National Health Interview Survey Series
Years covered: 1970 to present (titled Health Interview Survey for 1970-1975)
Scope: basic purpose is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. Information on the utilization of medical care facilities is also available in the form of data on medical and dental care, hospitalization, preventive care, nursing care, prosthetic appliances, and self-care. The Core variables are contained in the files for household, person, condition, doctor visit, and hospital data. Each year additional batteries of questions are asked which focus on specific topics. Supplemental NHIS data provide information on topics such as AIDS knowledge and attitudes, child health care and immunization, dental care, substance abuse, hospitalization, preventive care, nursing care, prosthetic appliances, and self-care. Supplements on Aging (SOA) conducted in 1984 and 1994 and the 1984-1990 Longitudinal Study of Aging (LSOA) were designed to furnish information on the causes and correlates of changes in the health and functioning of older Americans. Another component of the NHIS is the National Health Interview Survey on Disability (NHIS-D). Begun in 1994, the NHIS-D was designed to collect data that can be used to understand disability and to develop public policy on disability. Starting in 1997, the NHIS was redesigned to include a basic module, a periodic module, and a topical module. The basic module corresponds to the NHIS core questionnaire and is made up of the family core, the sample adult core, and the sample child core questions. The periodic module provides more detailed information on topics resulting from the basic module. The topical modules correspond to the supplements of the 1982-1996 NHIS and focus on public health data needs as they arise. Also see the Integrated Health Interview Series which has harmonized variables over time.  More recent microdata may be available directly on the NCHS site.
Sample Size and Makeup: representative sample of the civilian, noninstitutionalized population of the USA
How segmented: type of living quarters, size of family, geographic region, age, sex, race, marital status, veteran status, education, income, industry, occupation codes, and limits on activity.
Where is the documentation: ICPSR site and NCHS site.
Questions: Does income status impact health service received? Does the patient have Medicare, Medicaid, private health insurance? Does race impact prevalence of certain illnesses? Do certain areas of the country have higher incidences of certain diseases? What variables (age, race, sex, education, etc) impact one's health? Does your occupation impact your health?
Major Studies that have used this data : ICPSR Related Publications Site.
Summary Data:  select summary data in Sage Data 


National Longitudinal Study of Adolescent Health, Waves I-V, 1994-2018 (Add Health)
Scope: longitudinal study of a nationally representative sample of adolescents in grades 7-12 in the United States during the 1994-1995 school year. The Add Health cohort has been followed into young adulthood with 4 in-home interviews, the most recent in 2018, when the sample was aged 33-43. Add Health re-interviewed cohort members in a Wave V follow-up from 2016-2018 to collect social, environmental, behavioral, and biological data with which to track the emergence of chronic disease as the cohort moves through their fourth decade of life. Add Health combines longitudinal survey data on respondents' social, economic, psychological and physical well-being with contextual data on the family, neighborhood, community, school, friendships, peer groups, and romantic relationships, providing unique opportunities to study how social environments and behaviors in adolescence are linked to health and achievement outcomes in young adulthood.
How segmented: age, race, Hispanic ethnicity, Asian ethnicity, language spoken at home, country of birth, marital status, religion.
Where is the documentation: AddHealth site and also on ICPSR site
Questions: Which behaviors promote health and which ones are detrimental to health? What is the influence on health of factors particular to the communities in which adolescents reside?
Major Studies that have used this data : AddHealth site.

Also see National Longitudinal Study of Adolescent to Adult Health (Add Health) Parent Study: Public Use, [United States], 2015-2017


Consumer Expenditure Survey (CES) Series (1960-1961, 1972-1973, 1980+) (formerly called the Survey of Consumer Expenditures)
Scope: provides a continuous flow of information on the buying habits of American consumers and also furnishes data to support periodic revisions of the Consumer Price Index. The unit of analysis for the Consumer Expenditure Surveys is the consumer unit, consisting of all members of a particular housing unit who are related by blood, marriage, adoption, or some other legal arrangement. Consumer unit determination for unrelated persons is based on financial independence.
How segmented: size of household, age, race, income, education.
Where is the documentation: ICPSR site
Questions: Do expenditure patterns differ based on ethnicity? How do other household expenditures impact health insurance choices? How do spikes in fuel and electricity prices impact other household expenditures? How does child support impact household spending?
Major Studies that have used this data : ICPSR Related Publications Site.
Summary Data:  select summary data in Sage Data 


Survey of Consumer Finances. 1947-1971, 1977, 1983+
Scope: Conducted every 3 years since 1983, the survey collects information on the assets, liabilities, and other financial characteristics of households. It is the only U.S. survey that contains an oversample of wealthy households. The lowest level of geography is the 9 broad multi-state census regions.
Where is the documentation: Federal Reserve Board site and ICPSR site. The data are divided between the two: ICPSR does not have recent years, and the Federal Reserve Board does not have the early years.
Sample Size: About 4,500 families are interviewed in the main study.
Unit of Analysis: Most of the data in the survey are intended to represent the financial characteristics of a subset of the household unit referred to as the "primary economic unit" (PEU). In brief, the PEU consists of an economically dominant single individual or couple (married or living as partners) in a household and all other individuals in the household who are financially interdependent with that individual or couple. For example, in the case of a household composed of a married couple who own their home, a minor child, a dependent adult child, and a financially independent parent of one of the members of the couple, the PEU would be the couple and the two children. Summary information is collected at the end of the interview for all household members who are not included in the PEU. The only variables collected separately for the respondent and the spouse or partner of the respondent are those concerning employment, pension, and demographic characteristics.
Major Studies that have used this data : SCF Working Papers and ICPSR Related Publications Site.
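
The primary economic unit (PEU) rule described under Unit of Analysis can be illustrated with a small sketch. This is only an illustration of the worked example in the text, not the Federal Reserve Board's actual procedure: the `Member` class, its field names, and the `primary_economic_unit` function are hypothetical, and the flags are entered by hand rather than derived from survey variables.

```python
from dataclasses import dataclass


@dataclass
class Member:
    role: str
    # Hypothetical flag: part of the economically dominant individual or couple.
    is_core: bool
    # Hypothetical flag: financially interdependent with that individual or couple.
    interdependent: bool


def primary_economic_unit(household):
    """Return the members who belong to the PEU under the rule in the text."""
    return [m for m in household if m.is_core or m.interdependent]


# The household from the example above: a married couple who own their home,
# a minor child, a dependent adult child, and a financially independent parent.
household = [
    Member("spouse 1", is_core=True, interdependent=True),
    Member("spouse 2", is_core=True, interdependent=True),
    Member("minor child", is_core=False, interdependent=True),
    Member("dependent adult child", is_core=False, interdependent=True),
    Member("independent parent", is_core=False, interdependent=False),
]

peu = primary_economic_unit(household)
peu_roles = [m.role for m in peu]
print(peu_roles)
```

The financially independent parent falls outside the PEU, so only summary information would be collected for that person.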


Uniform Crime Reporting Program Data [United States] (1960+)
Scope: Periodic nationwide assessments of reported crimes. The UCR program was subsequently expanded to capture incident-level data with the implementation of the National Incident-Based Reporting System. The NIBRS data focus on various aspects of a crime incident. The UCR program began gathering hate crime data in 1990. Hate crimes are defined as crimes that manifest evidence of prejudice based on race, religion, sexual orientation, or ethnicity. In September 1994, disabilities, both physical and mental, were added to the list. Also see Missing Data in the Uniform Crime Reports, 1977-2000. Select data is also contained in Data-Planet Statistical Datasets. Also available in an easy-to-use cumulative format.
Where is the documentation:  ICPSR site
Sample Size: Over 4,500 agencies reporting since 1998.
Unit of Analysis:

  1. Agency-Level UCR Data

In the agency-level data, the unit of analysis is the police agency (or the incident, for Supplementary Homicide Reports and Hate Crime data). These files contain information on:

  • Offenses known and clearances by arrest
  • Property stolen and recovered. Aggregated at the agency level, these data report on the nature of the crime, the monetary value of the property stolen, and the type of property stolen.
  • Supplementary Homicide Reports (SHR) provide incident-based information on criminal homicides. They contain information describing the victim(s) of the homicide, the offender(s), the relationship between victim and offender, the weapon used, and the circumstance of the incident.
  • Police Employee (LEOKA) Data provide information about Law Enforcement Officers Killed or Assaulted in the line of duty. They provide in-depth information on the circumstances surrounding killings or assaults, including type of call answered, type of weapon used, and type of patrol the officers were on.
  • Hate Crime Data include the number of victims and offenders involved in each hate crime incident, type of victims, bias motivation, offense type, and location type.
  • Arrests by age, sex, and race
  • Arson
  2. Incident-Level UCR Data

Incident-level data are collected through the National Incident-Based Reporting System (NIBRS) component of the UCR. NIBRS collects data on each single incident and arrest within 22 offense categories made up of 46 specific crimes called Group A offenses. For each of the offenses coming to the attention of law enforcement, specified types of facts about each crime are collected. In addition to the Group A offenses, there are 11 Group B offense categories for which only arrest data are reported. More information about this study can be found in the NIBRS Resource Guide. Note that not all states participate in the program.

  3. County-Level UCR Data

The county-level Uniform Crime Reporting files contain only arrest data and crimes-reported data, and are distributed annually as 4 separate data files:

  • Arrests, All Ages
  • Arrests, Adults
  • Arrests, Juveniles
  • Crimes Reported

Beginning in 1993, an Allocated Statewide Data file is also distributed for each part above. The Statewide data files provide the amount of data from statewide agencies allocated to each county based on the county's share of the state population. These Statewide data files can also be used to "back out" the statewide counts if only the county total is desired.
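
The population-share allocation described above can be sketched with a little arithmetic. This is a minimal illustration, not the procedure used to build the ICPSR files: the function names, county names, and all counts are hypothetical, and it assumes that the county totals already include the allocated statewide amounts, so that "backing out" means subtracting them.

```python
def allocate_statewide(statewide_count, county_populations):
    """Split a statewide agency count across counties by population share."""
    state_population = sum(county_populations.values())
    return {
        county: statewide_count * population / state_population
        for county, population in county_populations.items()
    }


def back_out_statewide(county_totals, allocated):
    """Subtract the allocated statewide amounts to leave county-only counts.

    Assumes county_totals already include the allocated statewide data.
    """
    return {
        county: total - allocated[county]
        for county, total in county_totals.items()
    }


# Hypothetical state with three counties and 200 arrests made by statewide agencies.
populations = {"County A": 50_000, "County B": 30_000, "County C": 20_000}
allocated = allocate_statewide(200, populations)

# Hypothetical county totals that include the allocated statewide arrests.
totals = {"County A": 1_100.0, "County B": 660.0, "County C": 440.0}
county_only = back_out_statewide(totals, allocated)

print(allocated)
print(county_only)
```

With these made-up numbers, County A holds half the state population and so receives half of the 200 statewide arrests.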

Major Studies that have used this data :  ICPSR Related Publications Site (UCR) and ICPSR Related Publications Site (NIBRS)