We use multilevel or mixed-effects models (also known as hierarchical models) when the data is grouped, structured, or nested in multiple levels. Mixed-effects models consist of fixed effects (coefficients that do not vary by group) and random effects (coefficients that vary by group).
Examples of grouped or nested data:
Multilevel models allow:
Some advantages:
Pitfalls of ignoring multilevel data structure (Finch et al., 2019, page 29):
Stata has the option to estimate varieties of mixed-effects models, including linear mixed-effects models, generalized linear mixed-effects models, survival mixed-effects models, and nonlinear mixed-effects models. See this page for all the available options.
This tutorial provides step-by-step guides to estimate linear mixed-effects models using Stata.
This section provides a step-by-step guide to conducting multilevel analysis using cross-sectional data*.
First, get the data. Type the following codes on the Stata command window:
use "https://dss.princeton.edu/training/multilevel_example1.dta"
The dataset contains two levels: individuals (Level 1) and countries (Level 2). Here, individuals at Level 1 are nested within countries at Level 2.
The dataset contains the following variables:
Identifier variables:
individualID: level 1 identifier.
country: level 2 identifier. The dataset contains data for 20 countries, which indicates we have 20 groups in it.
Level 1 dependent variable:
income: individuals' annual incomes (log, international dollar).
Level 1 predictors:
education: respondents' level of education (binary, assigned 1 if completed secondary education and 0 otherwise).
age: age of the respondents.
gender: respondents' gender (binary, assigned 1 if female and 0 otherwise).
Level 2 predictors:
HDI: country-level human development index score.
demScore: country-level democracy score.
Let's now estimate the following basic multilevel/mixed effects models.
1. Varying-Intercept Model with no Predictors (Null Model)
In a varying-intercept model, the intercept (constant term) of the regression model is allowed to vary across different groups or clusters. For instance, if we have data on individuals from different countries, a varying-intercept model allows the intercept to vary for each country, capturing the fact that individuals from different countries may have different baseline levels of the outcome variable.
We estimate the varying-intercept model with no predictors (null model) to determine whether our data show evidence of clustering or group effects. If we find evidence that the dataset contains clustering/grouping effects, we should use a multilevel modeling approach. If we do not find any evidence of clustering effects in the dataset, we should estimate single-level models.
To estimate the null model, type:
mixed income || country:, mle
Here,
mixed: is the Stata command for estimating a multilevel model.
income: is the dependent variable.
|| country: this part specifies the random-effects structure of the model. In this case, country is a grouping variable (here, level 2 identifier variable) for which random intercepts are included.
, mle: specifies that the Maximum Likelihood Estimation (MLE) method should be used for estimating the model parameters.
Stata will give us the following outputs:
Interpretation
The above output table includes fixed effects estimates, random effects variance components, and other model diagnostic statistics.
- In the fixed effects part, the coefficient for _cons states that the grand mean of country level intercepts is 8.733811,and this is statistically significant as p-value is 0.00. In other words, the grand mean income of 20 countries in the dataset is 8.73 (log-scaled).
- In the random effects part, the variance of the random intercepts for the countries (var(_cons)) is 0.6593976 with a standard error of 0.2088699. This suggests substantial variation in income across countries, which is significantly different from zero as the confidence interval does not include zero (0.354423 to 1.226797).
- The residual variance (var(Residual)) is 0.9685608 with a standard error of 0.0096603, indicating the variability in income within countries that is not explained by the model.
- The likelihood ratio test (LR test) tests the null hypothesis of single level model (i.e., H0: var(_cons) = 0) against a multilevel model. The above table gives a LR test chi-square statistic of 9408.90, with a p-value of 0.0000, indicating that the multilevel model provides a significantly better fit to the data than a single level model without random effects.
In sum, the above null mixed-effects model indicates that the average income (intercept) is 8.733811, with significant variability both between and within countries. The inclusion of random intercepts for countries is justified by the significant LR test, highlighting the importance of accounting for country-level effects in the model.
Although the LR test can suggest whether we should use mixed effects model or a single level model, we can also calculate the intraclass correlation coefficient to determine it. The intraclass correlation coefficient measures the degree of similarity among observations within the same group or cluster.
To obtain the intraclass correlation coefficient, type:
estat icc
Stata will give us the following result:
Decision rule:
If the intraclass correlation coefficient (ICC) approaches 0 then the grouping by entities (here counties) are of no use (so you may just run a single level model). If the ICC approaches 1 then there is no variance to explain at the individual level, everybody is the same.
Here the ICC is greater than 0, which is .4050457 suggesting that we should use multilevel model instead of single level model for this dataset.
Note: the ICC is the ratio of level 2 variance to the sum of level 1 and level 2 variance estimates. We can thus calculate it manually by plugging in the respective values from the null model output table as follows:
2. Varying-intercept Model with Level 1 Predictors
From the LR test above and from the ICC value, we now know that our dataset requires estimating multilevel/mixed effects models. Let's now expand our model by adding all the available level 1 predictors. Run the following codes in Stata:
mixed income education age gender || country:, mle
Stata will give us the following outputs:
Interpretation
The above output table includes fixed effects estimates, random effects variance components, and other model diagnostic statistics.
Fixed Effects Estimates:
education: Completing secondary education (coded as 1) is associated with an approximate 0.552 unit increase in the logarithm of income compared to those who have not completed secondary education (coded as 0). Given that income is log-transformed, this coefficient corresponds to a 73.7% increase in income, calculated as (e0.552−1)×100%. This indicates that individuals with secondary education earn approximately 73.7% more than their counterparts without this education level, holding other factors constant. This effect is statistically significant at 1% level as the p-value is < 0.0001. This suggests a robust and substantial impact of secondary education on income levels.
age: Each additional year of age is associated with a 0.0021 unit increase in the logarithm of income, which is statistically significant (p < 0.0001). This indicates that income tends to grow exponentially by about 0.21% per additional year of age, assuming other variables are constant.
gender: The coefficient for gender is -0.0905, indicating that females (coded as 1) earn less than males (coded as 0), holding all other factors constant. Given that income is log-transformed, this coefficient corresponds to an 8.6% lower income for females compared to males, calculated as (e−0.0905−1)×100%. In other words, females earn approximately 8.6% less than males, holding all other variables constant. This result is statistically significant at the 1% level as p < 0.0001.
Random Effects Estimates:
var(_cons): The variance of the random intercepts for countries is 0.6171358 with a standard error of 0.1954933 suggests that there is substantial variation in income across countries. This is statistically significant, as the confidence interval ranging from 0.3316968 to 1.148207 does not include zero.
var(Residual): The residual variance within countries is 0.8969, indicating the variation in income within countries that is not explained by the included variables (education, age, and gender). This high variance suggests that there remains a notable amount of variation in income within each country, even after accounting for three individual-level variables.
Model Diagnostic Statistics:
Wald Test: The Wald chi-square test tests the null hypothesis that the fixed effects are simultaneously zero. The chi-square statistic of 1507.06 with 3 degrees of freedom and a p-value of 0.0000 suggests that the model's predictors (i.e., education, age, and gender) significantly help in explaining the variability in income.
Likelihood Ratio Test: The chi-square statistic of 9119.20 with a p-value of 0.0000 when comparing against a single level model indicates that incorporating random effects (variance across countries) significantly improves the model's fit.
3. Varying-intercept Model with Level 1 and Level 2 Predictors
Let us now add the level 2 variables (HDI and demScore) in our model. Use the following codes:
mixed income education age gender HDI demScore || country:, mle
Stata will give us the following outputs:
Interpretation
The above output table includes fixed effects estimates, random effects variance components, and other model diagnostic statistics.
Fixed Effects Estimates:
education: Completing secondary education (coded as 1) is associated with an approximate 0.5514 unit increase in the logarithm of income compared to those who have not completed secondary education (coded as 0). Given that income is log-transformed, this coefficient corresponds to a 73.57% increase in income, calculated as (e0.5514−1)×100%.This indicates that individuals with secondary education earn approximately 73.57% more than their counterparts without this education level, holding other factors constant. This effect is statistically significant at 1% level as the p-value is < 0.0001. This suggests a robust and substantial impact of secondary education on income levels.
age: Each additional year of age is associated with a 0.0020 unit increase in the logarithm of income, which is statistically significant (p < 0.0001). This indicates that income tends to grow exponentially by about 0.20% per additional year of age, assuming other variables are constant.
gender: The coefficient for gender is -0.0906, indicating that females (coded as 1) earn less than males (coded as 0), holding all other factors constant. Given that income is log-transformed, this coefficient corresponds to an 8.66% lower income for females compared to males, calculated as (e−0.0906−1)×100%. In other words, females earn approximately 8.66% less than males, holding all other variables constant. This result is statistically significant at the 1% level as p < 0.0001.
HDI: Each unit increase in HDI is associated with a substantial increase in the log of income by 6.2258, which is highly significant (p < 0.0001). This indicates a strong positive relationship between country human development status and individual income.
demScore: The democracy score has a coefficient of -0.0772, but this is not statistically significant (p = 0.441), suggesting that within the context of this model, democracy score does not have a clear impact on income.
Random Effects Estimates:
var(_cons): The variance of the random intercepts for countries is 0.2513 with a standard error of 0.0798 suggests that there is substantial variation in income across countries. This is statistically significant, as the confidence interval ranging from 0.1348952 to 0.4682901 does not include zero.
var(Residual): The within-country variance in income after accounting for the fixed effects is 0.8969, indicating substantial unexplained variability within countries.
Model Diagnostic Statistics:
Wald Test: The Wald chi-square test tests the null hypothesis that the fixed effects are simultaneously zero. The chi-square statistic of 1538.21 with 5 degrees of freedom and a p-value of 0.0000 suggests that the model's predictors (i.e., education, age, gender, HDI, and demScore) significantly help in explaining the variability in income.
Likelihood Ratio Test: The likelihood ratio (LR) test against a linear model without random effects yields a chi-square value of 4486.86 with a p-value of 0.0000, indicating that the mixed-effects model fits significantly better than a simpler linear model without random effects.
Postestimation: Comparing Models using Likelihood-ratio Test
We can use the likelihood-ratio test (lrtest) to compare the model with level 1 predictors with the model with both level 1 and level 2 predictors. This test compares the “log likelihood” (as shown in the output table below) of two models and tests whether they are significantly different. Use the following Stata codes:
Stata will give us the following outputs:
The null hypothesis is that there is no significant difference between the two models. If Prob > chi2 <0.05, then you may reject the null and conclude that there is a statistically significant difference between the models. In the example above, we reject the null and conclude that the model with both level 1 and level 2 predictors provides a better fit (as it has the lowest log likelihood).
4. Varying-Intercept, Varying-Slope Model
The Varying-Intercept, Varying-Slope Model (also known as the Varying-Coefficient, Varying-Intercept Model or simply the Random Slopes Model) is an extension of the basic mixed effects model framework. In a varying-intercept, varying-slope model, both intercepts and coefficients vary across groups simultaneously. If we want to covariate intercepts and slopes for only two of the level 1 predictors (education and age), use the following codes in Stata. Depending on your research questions or conceptual framework, you can select more or fewer predictors to covary (we have selected two predictors for demonstration purposes only).
Here,
mixed: the command in Stata for fitting mixed effects models.
income: dependent variable.
education age gender HDI demScore: fixed effects predictors (both level 1 and level 2).
|| country: education age : specifies the random effects part of the model, where
covariance(unstructured): specifies the covariance structure for the random effects.
, mle: specifies that the model should be estimated using Maximum Likelihood Estimation (MLE). This is the default estimation method for mixed effects models in Stata.
Running the above codes will give us the following outputs:
Interpretation
The above output table includes fixed effects estimates, random effects variance-covariance components, and other model diagnostic statistics.
Fixed Effects Estimates:
education: Completing secondary education (coded as 1) is associated with an approximate 0.5504 unit increase in the logarithm of income compared to those who have not completed secondary education (coded as 0). Given that income is log-transformed, this coefficient corresponds to a 73.0% increase in income, calculated as (e0.5504−1)×100%. This indicates that individuals with secondary education earn approximately 73.0% more than their counterparts without this education level, holding other factors constant. This effect is statistically significant at a 1% level as the p-value is < 0.0001. This suggests a robust and substantial impact of secondary education on income levels.
age: Each additional year of age is associated with a 0.0019 unit increase in the logarithm of income, which is not statistically significant at the conventional 5% level, as the p-value is 0.085.
gender: The coefficient for gender is -0.0914, indicating that females (coded as 1) earn less than males (coded as 0), holding all other factors constant. Given that income is log-transformed, this coefficient corresponds to an 8.7% lower income for females compared to males, calculated as (e−-0.0914−1)×100%. In other words, females earn approximately 8.7% less than males, holding all other variables constant. This result is statistically significant at the 1% level as p < 0.0001.
HDI: Each unit increase in HDI is associated with a substantial increase in the log of income by 6.1938, which is highly significant (p < 0.0001). This indicates a strong positive relationship between country human development status and individual income.
demScore: The democracy score has a coefficient of -0.0612, but this is not statistically significant (p = 0.532), suggesting that within the context of this model, democracy score does not have a clear impact on income.
Random Effects Estimates:
var(education): The variance of the random slope for education indicates the extent to which the effect of education on income varies across countries. The positive variance estimate (0.0552) suggests significant heterogeneity in how education impacts income between countries. The confidence interval shows that this variability is statistically significant, as it does not include zero.
var(age): The variance of the random slope for age is very small (0.00002), indicating that there is minimal variability in the effect of age on income across countries. The confidence interval suggests that this variance is statistically significant, but the small magnitude implies that age has a relatively stable effect on income regardless of the country.
var(_cons): The variance of the random intercept reflects the variability in baseline income levels across countries after accounting for the fixed effects. A significant intercept variance (0.3722) indicates that there are substantial differences in the average income levels across countries, suggesting that country-specific factors influence baseline income levels.
cov(education,age):The positive covariance estimate (0.00054) suggests a slight positive relationship between the random slopes for education and age. This means that in countries where the effect of education on income is stronger, the effect of age on income is also likely to be stronger. However, the confidence interval includes zero, indicating that this relationship is not statistically significant.
cov(education,_cons): The negative covariance (-0.0746) suggests that in countries with higher baseline income levels, the impact of education on income tends to be weaker. However, the confidence interval includes zero, which implies that this relationship is not statistically significant.
var(Residual): The residual variance represents the variability in income that is not explained by the model, after accounting for both fixed and random effects. A residual variance of 0.8832 indicates that there is still substantial variability in income that is not captured by the predictors and random effects in the model.
Model Diagnostic Statistics:
Wald Test: The Wald chi-square test tests the null hypothesis that the fixed effects are simultaneously zero. The chi-square statistic of 192.76 with 5 degrees of freedom and a p-value of 0.0000 suggests that the model's predictors (i.e., education, age, gender, HDI, and demScore) significantly help in explaining the variability in income.
Likelihood Ratio Test: The likelihood ratio (LR) test against a linear model without random effects yields a chi-square value of 4715.00 with a p-value of 0.0000, indicating that the mixed-effects model fits significantly better than a simpler linear model without random effects.
* The dataset is obtained from Constantelos et al. (2023), available in Harvard Dataverse, and modified for demonstration purposes.
If you have questions or comments about this guide or method, please email data@Princeton.edu.