1. Logit, Ordered logit and Multinomial logit models: concepts
2.1. Log-odds ratio of logit models
2.2. Odds ratio of logit models
2.3. Output tables of logit models
2.4. Predicted probabilities of logit models
2.5. Marginal effects of logit models
3.1. Log-odds ratio and odds ratio of ordered logit models
3.2. Output tables of ordered logit models
3.3. Predicted probabilities of ordered logit models
3.4. Marginal effects of ordered logit models
4.1. Coefficients of Multinomial logit models
4.2. Output tables and relative risk ratios of multinomial logit models
Simple linear regression is not recommended when the outcome variable is dichotomous (a 0/1 dummy).
'Many social phenomena are discrete or qualitative rather than continuous or quantitative in nature—an event occurs or it does not occur, a person makes one choice but not the other, an individual or group passes from one state to another' (Pampel, 2020).
In these cases, we can use the alternatives: Logit, Ordered Logit and Multinomial logit models.
Logit models
If the outcome or dependent variable is binary, coded 0/1, use logit or probit models. Some examples are:
Did you vote in the last election?
0 'No'
1 'Yes'
Do you prefer to use public transportation or to drive a car?
0 'Prefer to drive'
1 'Prefer public transport'
Ordered logit models
If the outcome or dependent variable is categorical and ordered (e.g., low to high), use ordered logit or ordered probit models. Some examples are:
Do you agree or disagree with the President?
1 'Disagree'
2 'Neutral'
3 'Agree'
What is your socioeconomic status?
1 'Low'
2 'Middle'
3 'High'
Multinomial Logit
If the outcome or dependent variable is categorical without any particular order, use multinomial logit models. Some examples are:
If elections were held today, for which party would you vote?
1 'Democrats'
2 'Independent'
3 'Republicans'
What do you like to do on the weekends?
1 'Rest'
2 'Go to movies'
3 'Exercise'
Getting sample data.
library(foreign)
mydata <- read.dta("https://dss.princeton.edu/training/Panel101.dta")
Run a logit model
logit <- glm(y_bin ~ x1 + x2 + x3, family=binomial(link="logit"), data=mydata)
summary(logit)

Call:
glm(formula = y_bin ~ x1 + x2 + x3, family = binomial(link = "logit"),
    data = mydata)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.0277   0.2347   0.5542   0.7016   1.0839

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.4262     0.6390   0.667   0.5048
x1            0.8618     0.7840   1.099   0.2717
x2            0.3665     0.3082   1.189   0.2343
x3            0.7512     0.4548   1.652   0.0986 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 70.056  on 69  degrees of freedom
Residual deviance: 65.512  on 66  degrees of freedom
AIC: 73.512

Number of Fisher Scoring iterations: 5
The p-value column tests the null hypothesis that a coefficient is equal to zero (i.e., no effect). The usual cutoff is 0.05; by this measure none of the coefficients has a significant effect on the log-odds of the dependent variable. The coefficient for x3 is significant at the 10% level (p < 0.10).
Note that the Estimate column shows the coefficients in log-odds form. This means that when x3 increases by one unit, the expected change in the log-odds is 0.7512. What this column tells you is whether the effect of each predictor is positive or negative. For easier interpretation, we can look at the odds ratios of the model.
We can use odds ratios for better interpretation. To get the odds ratios, you need to exponentiate the logit coefficients.
We can do so by hand:
cbind(Estimate=round(coef(logit),4), OR=round(exp(coef(logit)),4))
            Estimate     OR
(Intercept)   0.4262 1.5314
x1            0.8618 2.3674
x2            0.3665 1.4427
x3            0.7512 2.1196
Here, to interpret the odds ratio of x3, we can say that when x3 increases by one unit, the odds of y = 1 increase by 112% ((2.12 − 1) × 100). Or: the odds of y = 1 are 2.12 times higher when x3 increases by one unit (keeping all other predictors constant).
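The percentage arithmetic can be checked directly in R; a one-line sketch, assuming the logit object fitted above:

# Percent change in the odds of y = 1 for a one-unit increase in x3
(exp(coef(logit))["x3"] - 1) * 100   # ~112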
We can also obtain the same results using the mfx package.
library(mfx)
logitor(y_bin ~ x1 + x2 + x3, data=mydata)

Call:
logitor(formula = y_bin ~ x1 + x2 + x3, data = mydata)

Odds Ratio:
   OddsRatio Std. Err.      z   P>|z|
x1   2.36735   1.85600 1.0992 0.27168
x2   1.44273   0.44459 1.1894 0.23427
x3   2.11957   0.96405 1.6516 0.09861 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Now that we have the model output, we may want to present it in a format suitable for publication.
We can use the stargazer() function from the package -stargazer to do so.
First we look at the output table for the raw logit coefficient estimates.
library(stargazer)
stargazer(logit, type="html", out="logit.htm")
# use type="text" if you want to see the results directly in the RStudio console
The table will be saved in the working directory under the name ‘logit.htm’, which you can open with Word or any other word processor. If you are not sure of your current working directory, use the getwd() function.
Now we can look at the output table.
Odds ratios make the logit coefficients easier to interpret, so we may want to print the output table with odds ratios instead of the raw estimates.
# We first create the values of the odds ratios
logit.or = exp(coef(logit))
logit.or
(Intercept)          x1          x2          x3
   1.531417    2.367352    1.442727    2.119566
# Then we create the output table using the stargazer function
stargazer(logit, type="html", coef=list(logit.or), p.auto=FALSE, out="logitor.htm")
To interpret the odds ratios:
Keeping all other variables constant, when x1 increases by one unit, the subject is 2.367 times more likely to be in the 1 category. In other words, the odds of being in the 1 category (as opposed to the 0 category) are 137% higher when x1 moves one unit ((2.37 − 1) × 100). The coefficient, however, is not significant.
After building the model, we can estimate the probability that the outcome variable equals 1 at given values of the predictors.
We can calculate the probabilities with the inverse logit function (Gelman and Hill, 2007):

$$\Pr(y_i = 1) = \mathrm{logit}^{-1}(X_i\beta) = \frac{1}{1 + e^{-X_i\beta}}$$
Recall our logit model:
logit <- glm(y_bin ~ x1 + x2 + x3, family=binomial(link="logit"), data=mydata)
coef(logit)
(Intercept)          x1          x2          x3
  0.4261935   0.8617722   0.3665348   0.7512115
Therefore, with these coefficients, the predicted probability that the outcome variable equals one can be written as:

$$\Pr(y = 1) = \frac{1}{1 + \exp\big(-(0.4262 + 0.8618\,x_1 + 0.3665\,x_2 + 0.7512\,x_3)\big)}$$
Estimating the probability at the mean point of each predictor can be done by inverting the logit model. Gelman and Hill provide a function for this (p. 81), also available in the R package –arm-.
invlogit = function (x) {1/(1+exp(-x))}
Prob<-invlogit(coef(logit)[1]+
coef(logit)[2]*mean(mydata$x1)+
coef(logit)[3]*mean(mydata$x2)+
coef(logit)[4]*mean(mydata$x3))
Prob
(Intercept)
0.8328555
This means that the predicted probability of the outcome variable being 1 is around 83% when the x variables take their mean values.
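As an aside, base R already ships this inverse-logit function as plogis(), the logistic CDF, so the same probability can be computed without defining invlogit() by hand:

# plogis() is the logistic CDF, i.e. the inverse logit
plogis(coef(logit)[1] + coef(logit)[2]*mean(mydata$x1) +
       coef(logit)[3]*mean(mydata$x2) + coef(logit)[4]*mean(mydata$x3))
# (Intercept)
#   0.8328555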
Now we add categorical variables to the model.
logit.cat <- glm(y_bin ~ x1 + x2 + x3 + opinion,
family=binomial(link="logit"), data=mydata)
coef(logit.cat)
(Intercept) x1 x2 x3
0.8816118 1.1335562 0.3021217 0.3976276
opinionAgree opinionDisag opinionStr disag
-1.9163569 0.3270627 0.6891686
Estimating the probability when opinion = ‘Agree’
invlogit = function (x) {1/(1+exp(-x))}
Prob_agree<-invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)+
coef(logit.cat)[5]*1)
Prob_agree
(Intercept)
0.5107928
Estimating the probability when opinion = ‘Disagree’
invlogit = function (x) {1/(1+exp(-x))}
Prob_disagree<-invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)+
coef(logit.cat)[6]*1)
Prob_disagree
(Intercept)
0.9077609
Estimating the probability when opinion = ‘Strongly disagree’
invlogit = function (x) {1/(1+exp(-x))}
Prob_str_disagree<-invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)+
coef(logit.cat)[7]*1)
Prob_str_disagree
(Intercept)
0.933931
Estimating the probability when opinion = ‘Strongly agree’
invlogit = function (x) {1/(1+exp(-x))}
Prob_str_agree<-invlogit(coef(logit.cat)[1]+
coef(logit.cat)[2]*mean(mydata$x1)+
coef(logit.cat)[3]*mean(mydata$x2)+
coef(logit.cat)[4]*mean(mydata$x3)
)
Prob_str_agree
(Intercept)
0.8764826
Using the predict function
Another way to estimate the predicted probabilities is by setting initial conditions: we get the predicted probabilities holding all predictors or independent variables at their means.
First create a new dataset with the mean values of the predictors.
allmean <- data.frame(x1=mean(mydata$x1), x2=mean(mydata$x2), x3=mean(mydata$x3))
allmean
         x1        x2       x3
1 0.6480006 0.1338694 0.761851
After estimating the logit model and creating the dataset with the mean values of the predictors, you can use the predict() function to estimate the predicted probabilities (for help/details type ?predict.glm), and add them to the allmean dataset.
allmean$pred.prob <- predict(logit, newdata=allmean, type="response")
allmean
x1 x2 x3 pred.prob
1 0.6480006 0.1338694 0.761851 0.8328555
When all predictors are held at their means, the probability of y = 1 is 83%.
Similarly, we can do this for the model that includes categorical variables.
Again, we create a new dataset first.
allmean <- data.frame(x1=rep(mean(mydata$x1),4),
                      x2=rep(mean(mydata$x2),4),
                      x3=rep(mean(mydata$x3),4),
                      opinion=as.factor(c("Str agree","Agree","Disag","Str disag")))
allmean
         x1        x2       x3   opinion
1 0.6480006 0.1338694 0.761851 Str agree
2 0.6480006 0.1338694 0.761851     Agree
3 0.6480006 0.1338694 0.761851     Disag
4 0.6480006 0.1338694 0.761851 Str disag
Then we use the predict function again.
# First create the model with the categorical variable
logit <- glm(y_bin ~ x1+x2+x3+opinion, family=binomial(link="logit"), data=mydata)
allmean <- cbind(allmean, predict(logit, newdata=allmean, type="response", se.fit=TRUE))
allmean
         x1        x2       x3   opinion       fit     se.fit residual.scale
1 0.6480006 0.1338694 0.761851 Str agree 0.8764826 0.07394431              1
2 0.6480006 0.1338694 0.761851     Agree 0.5107928 0.15099064              1
3 0.6480006 0.1338694 0.761851     Disag 0.9077609 0.06734568              1
4 0.6480006 0.1338694 0.761851 Str disag 0.9339310 0.06446677              1
The fit column gives the predicted probabilities and the se.fit column gives the standard error of the prediction.
We can rename these two columns and then estimate confidence intervals.
# Renaming the "fit" and "se.fit" columns
names(allmean)[names(allmean)=="fit"] = "prob"
names(allmean)[names(allmean)=="se.fit"] = "se.prob"
# Estimating 95% confidence intervals
allmean$ll = allmean$prob - 1.96*allmean$se.prob
allmean$ul = allmean$prob + 1.96*allmean$se.prob
allmean
         x1        x2       x3   opinion      prob    se.prob residual.scale        ll        ul
1 0.6480006 0.1338694 0.761851 Str agree 0.8764826 0.07394431              1 0.7315518 1.0214134
2 0.6480006 0.1338694 0.761851     Agree 0.5107928 0.15099064              1 0.2148511 0.8067344
3 0.6480006 0.1338694 0.761851     Disag 0.9077609 0.06734568              1 0.7757634 1.0397585
4 0.6480006 0.1338694 0.761851 Str disag 0.9339310 0.06446677              1 0.8075762 1.0602859
Having the estimated confidence intervals, we can plot them.
library(ggplot2)
ggplot(allmean, aes(x=opinion, y = prob)) +
geom_errorbar(aes(ymin = ll, ymax = ul), width = 0.2, lty=1, lwd=1, col="red") +
geom_point(shape=18, size=5, fill="black") +
scale_x_discrete(limits = c("Str agree","Agree","Disag","Str disag")) +
labs(title= " Predicted probabilities", x="Opinion", y="Pr(y=1)",
caption = "add footnote here") +
theme(plot.title = element_text(family = "sans", face="bold", size=13, hjust=0.5),
axis.title = element_text(family = "sans", size=9),
plot.caption = element_text(family = "sans", size=5))
Marginal effects show the change in probability when a predictor or independent variable increases by one unit. For continuous variables this represents the instantaneous rate of change, since the ‘unit’ may be very small. For binary variables, the change is from 0 to 1, so one ‘unit’ in the usual sense.
We can use the logitmfx() function in the mfx package.
library(mfx)
logitmfx(y_bin ~ x1+x2+x3, data=mydata)
Call:
logitmfx(formula = y_bin ~ x1 + x2 + x3, data = mydata)
Marginal Effects:
dF/dx Std. Err. z P>|z|
x1 0.119965 0.104836 1.1443 0.25249
x2 0.051024 0.041155 1.2398 0.21504
x3 0.104574 0.053890 1.9405 0.05232 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Interpretation of the numbers in the dF/dx column:
x1 = 0.119965: an instantaneous change in x1 changes the probability of y = 1 by about 12 percentage points (pp). However, the effect is not statistically significant because the p-value is not below 0.05.
x2 = 0.051024: an instantaneous change in x2 changes the probability by about 5 percentage points (pp). However, the effect is not statistically significant because the p-value is not below 0.05.
x3 = 0.104574: an instantaneous change in x3 changes the probability by about 10 percentage points (pp). The effect is borderline significant: the p-value (0.052) is just above the 0.05 cutoff.
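By default, logitmfx() evaluates the marginal effects at the means of the predictors. For a continuous predictor under the logit link, dF/dx equals the coefficient times p(1 − p), where p is the predicted probability at that point. A minimal sketch reproducing the x3 entry by hand (re-fitting the three-predictor model, since the logit object was redefined earlier):

logit <- glm(y_bin ~ x1 + x2 + x3, family=binomial(link="logit"), data=mydata)
# Predicted probability at the mean point of the predictors (0.8328555, as above)
p <- plogis(sum(coef(logit) * c(1, mean(mydata$x1), mean(mydata$x2), mean(mydata$x3))))
# Marginal effect of x3 at the means: beta3 * p * (1 - p)
coef(logit)["x3"] * p * (1 - p)   # ~0.1046, matching the dF/dx column above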
We can add the categorical variables too.
library(mfx)
logitmfx(y_bin ~ x1+x2+x3+opinion, data=mydata)

Call:
logitmfx(formula = y_bin ~ x1 + x2 + x3 + opinion, data = mydata)

Marginal Effects:
                     dF/dx Std. Err.       z   P>|z|
x1                0.138463  0.109393  1.2657 0.20560
x2                0.036904  0.042107  0.8764 0.38079
x3                0.048570  0.054832  0.8858 0.37573
opinionAgree     -0.328904  0.192989 -1.7043 0.08833 .
opinionDisag      0.037897  0.108900  0.3480 0.72784
opinionStr disag  0.073858  0.114590  0.6445 0.51922
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

dF/dx is for discrete change for the following variables:
[1] "opinionAgree"     "opinionDisag"     "opinionStr disag"
In this case:
Agree = -0.328904: when opinion goes from ‘strongly agree’ to ‘agree’, the probability of y = 1 decreases by 33 percentage points (-0.33). However, the change is not statistically significant because the p-value is not below 0.05.
Disag = 0.037897: when opinion goes from ‘strongly agree’ to ‘disagree’, the probability increases by 4 percentage points (0.04). However, the change is not statistically significant because the p-value is not below 0.05.
Str Disag = 0.073858: when opinion goes from ‘strongly agree’ to ‘strongly disagree’, the probability increases by 7 percentage points (0.07). However, the change is not statistically significant because the p-value is not below 0.05.
When the outcome variable is categorical but ordered, we should use the ordered logit models.
Getting sample data
library(foreign)
mydata <- read.dta("https://dss.princeton.edu/training/Panel101.dta")
Run the model
# Loading the MASS library
library(MASS)
# Running the ordered logit model
m1 <- polr(opinion ~ x1 + x2 + x3, data=mydata, Hess=TRUE)
summary(m1)

Call:
polr(formula = opinion ~ x1 + x2 + x3, data = mydata, Hess = TRUE)

Coefficients:
     Value Std. Error t value
x1 0.98140     0.5641  1.7397
x2 0.24936     0.2086  1.1954
x3 0.09089     0.1549  0.5867

Intercepts:
                Value   Std. Error t value
Str agree|Agree -0.2054  0.4682    -0.4388
Agree|Disag      0.7370  0.4697     1.5690
Disag|Str disag  1.9951  0.5204     3.8335

Residual Deviance: 189.6382
AIC: 201.6382
We can store the coefficients and their p-values in a new data frame.
m1.coef <- data.frame(coef(summary(m1)))
m1.coef$pval = round((pnorm(abs(m1.coef$t.value), lower.tail = FALSE) * 2),2)
m1.coef
                      Value Std..Error    t.value pval
x1               0.98139603  0.5641136  1.7397134 0.08
x2               0.24935530  0.2086027  1.1953599 0.23
x3               0.09089175  0.1549254  0.5866807 0.56
Str agree|Agree -0.20542664  0.4682027 -0.4387558 0.66
Agree|Disag      0.73696754  0.4696907  1.5690486 0.12
Disag|Str disag  1.99507902  0.5204282  3.8335334 0.00
Similarly, we can calculate the odds ratios by taking the exponential of these coefficients.
cbind(Estimate=round(coef(m1),4),
OR=round(exp(coef(m1)),4))
Estimate OR
x1 0.9814 2.6682
x2 0.2494 1.2832
x3 0.0909 1.0952
Here, keeping all other variables constant, when x1 increases by one unit, the subject is 2.668 times more likely to be in a higher category. In other words, the odds of moving to a higher category of the outcome variable are 167% higher when x1 moves one unit ((2.67 − 1) × 100). The coefficient is significant at the 10% level (p = 0.08).
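The percent-change reading used above can be computed in one line; a minimal sketch, assuming the m1 model from above:

# Percent change in the odds of a higher category for a one-unit increase in each predictor
round((exp(coef(m1)) - 1) * 100, 1)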
Again, we can use the stargazer() function from the package -stargazer to create presentable tables.
First we look at the output table for the raw coefficient estimates.
library(stargazer)
stargazer(m1, type="html", out="m1.htm")
Similarly, the table will be saved in the working directory under the name ‘m1.htm’, which you can open with Word or any other word processor. If you are not sure of your current working directory, use the getwd() function.
Now we can look at the output table.
Similarly, we can print the odds ratios instead of the log-odds (the raw estimates).
m1.or = exp(coef(m1))
m1.or
stargazer(m1, type="html", coef=list(m1.or), p.auto=FALSE, out="m1or.htm")
We use the predict() function for predicted probabilities.
m1.pred <- predict(m1, type="probs")
summary(m1.pred)
   Str agree          Agree            Disag          Str disag
 Min.   :0.1040   Min.   :0.1255   Min.   :0.1458   Min.   :0.07418
 1st Qu.:0.2307   1st Qu.:0.2038   1st Qu.:0.2511   1st Qu.:0.17350
 Median :0.2628   Median :0.2144   Median :0.2851   Median :0.23705
 Mean   :0.2869   Mean   :0.2124   Mean   :0.2715   Mean   :0.22923
 3rd Qu.:0.3458   3rd Qu.:0.2271   3rd Qu.:0.2949   3rd Qu.:0.26968
 Max.   :0.5802   Max.   :0.2313   Max.   :0.3045   Max.   :0.48832
The numbers in the Mean row are the predicted probabilities of each category, averaged over all observations in the sample.
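As a quick check, the Mean row is just the column means of the matrix of predicted probabilities; a one-line sketch, assuming m1.pred from above:

round(colMeans(m1.pred), 4)  # average predicted probability of each category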
Here is an example of calculating the predicted probabilities at specific values.
We set x1 and x2 at their means and compare x3 = 1 with x3 = 2.
We first set up the values in a dataframe.
setup1 <- data.frame(x1=rep(mean(mydata$x1),2),
                     x2=rep(mean(mydata$x2),2),
                     x3=c(1,2))
setup1
         x1        x2 x3
1 0.6480006 0.1338694  1
2 0.6480006 0.1338694  2
Then we calculate the predicted probabilities.
setup1[, c("pred.prob")] <- predict(m1, newdata=setup1, type="probs")
setup1
x1 x2 x3 pred.prob.Str agree pred.prob.Agree pred.prob.Disag pred.prob.Str disag
1 0.6480006 0.1338694 1 0.2757495 0.2184382 0.2804806 0.2253318
2 0.6480006 0.1338694 2 0.2579719 0.2135235 0.2869123 0.2415923
We use type="probs" to get predicted probabilities at the given predictor values.
We can use type="class" instead; in that case, the predict() function returns the predicted class rather than the probabilities.
setup1[, c("pred.prob")] <- predict(m1, newdata=setup1, type="class")
setup1
x1 x2 x3 pred.prob
1 0.6480006 0.1338694 1 Disag
2 0.6480006 0.1338694 2 Disag
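Under the hood, type="class" simply picks the category with the highest predicted probability. A minimal sketch confirming this, assuming the m1 and setup1 objects from above:

probs <- predict(m1, newdata=setup1[, c("x1","x2","x3")], type="probs")  # probability matrix
colnames(probs)[apply(probs, 1, which.max)]  # category with the largest probability per row
# [1] "Disag" "Disag"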
We use the ocME() function in the package 'erer' to estimate the marginal effects.
library(erer)
x <- ocME(m1)
x
   effect.Str agree effect.Agree effect.Disag effect.Str disag
x1           -0.198       -0.047        0.076            0.169
x2           -0.050       -0.012        0.019            0.043
x3           -0.018       -0.004        0.007            0.016
Interpretation:
The marginal effects indicate that for an instantaneous change in x1, the subject is 17 percentage points more likely to strongly disagree, 8 percentage points more likely to disagree, 5 percentage points less likely to agree, and 20 percentage points less likely to strongly agree. Note that the effects sum to zero across the four categories, since the predicted probabilities must sum to one.
When the outcome variable is categorical but not ordered, we should use the multinomial logit models.
Loading the required packages
library(foreign)
library(nnet)
library(stargazer)
Now getting the sample data from UCLA
mydata=read.dta("https://stats.idre.ucla.edu/stat/data/hsbdemo.dta")
Checking the output (dependent) variable
table(mydata$ses)

   low middle   high
    47     95     58
By default, the first category is the reference. We can change it so that ‘middle’ is the reference category:
mydata$ses2 = relevel(mydata$ses, ref = "middle")
Note: This section is based on the UCLA website https://stats.oarc.ucla.edu/r/dae/multinomial-logistic-regression/. Results here reproduce the output in the latter to compare, and to provide an additional source to interpret outcomes.
Now we run the multinomial logit model using the multinom() function
multi1 = multinom(ses2 ~ science + socst + female, data=mydata)
summary(multi1)

Call:
multinom(formula = ses2 ~ science + socst + female, data = mydata)

Coefficients:
     (Intercept)     science       socst femalefemale
low     1.912288 -0.02356494 -0.03892428   0.81659717
high   -4.057284  0.02292179  0.04300323  -0.03287211

Std. Errors:
     (Intercept)    science      socst femalefemale
low     1.127255 0.02097468 0.01951649    0.3909804
high    1.222937 0.02087182 0.01988933    0.3500151

Residual Deviance: 388.0697
AIC: 404.0697
These are the logit coefficients relative to the reference category. For example, under ‘science’, the -0.02 means that for a one-unit increase in the science score, the log-odds of being in the ‘low’ ses category relative to ‘middle’ decrease by 0.02. In other words, if your science score increases by one unit, your chances of being in the middle ses category increase compared to staying in the low ses category.
The multinom() function does not provide p-values; you can get the significance of the coefficients using the stargazer() function from the package –stargazer.
library(stargazer)
stargazer(multi1, type="html", out="multi1.htm")
The model will be saved in the working directory under the name ‘multi1.htm’ which you can open with Word or any other word processor. If you are not sure about your current working directory, you can use the getwd() function to see.
Also, use the option type = "text" if you want to see the results directly in the RStudio console.
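Alternatively, approximate p-values can be computed by hand with a two-tailed Wald z-test, dividing each coefficient by its standard error; a minimal sketch using the summary() components shown above (this mirrors the approach on the UCLA page this section is based on):

z <- summary(multi1)$coefficients / summary(multi1)$standard.errors  # Wald z statistics
p <- (1 - pnorm(abs(z), 0, 1)) * 2                                   # two-tailed p-values
round(p, 3)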
Similarly, we can use relative risk ratios to interpret the logit coefficients. Like odds ratios, they are the exponentiated values of the logit coefficients.
multi1.rrr = exp(coef(multi1))
multi1.rrr
     (Intercept)   science     socst femalefemale
low   6.76855944 0.9767105 0.9618235    2.2627869
high  0.01729593 1.0231865 1.0439413    0.9676623
Now we create the output table as well.
library(stargazer)
stargazer(multi1, type="html", coef=list(multi1.rrr), p.auto=FALSE, out="multi1rrr.htm")
Interpretation:
Keeping all other variables constant, if your science score increases by one unit, you are 0.97 times as likely to be in the low ses category as in the middle ses category (the relative risk is about 3% lower). The coefficient, however, is not significant.
Interpretation:
Keeping all other variables constant, if your science score increases by one unit, you are 1.02 times as likely to be in the high ses category as in the middle ses category (the relative risk is about 2% higher). The coefficient, however, is not significant.
We can also include binary variables when calculating predicted probabilities.
For example, at specific values: science and socst at their means, for males and females.
allmean <- data.frame(science=rep(mean(mydata$science),2),
                      socst=rep(mean(mydata$socst),2),
                      female=c("male","female"))
allmean
  science  socst female
1   51.85 52.405   male
2   51.85 52.405 female
Again, we use "probs" for predicted probabilities given specific predictors
allmean[, c("pred.prob")] <- predict(multi1, newdata=allmean, type="probs")
allmean
  science  socst female pred.prob.middle pred.prob.low pred.prob.high
1   51.85 52.405   male        0.5555769     0.1441171      0.3003061
2   51.85 52.405 female        0.4739293     0.2781816      0.2478890
For predicting categories, we use 'class'
allmean[, c("pred.prob")] <- predict(multi1, newdata=allmean, type="class")
allmean
  science  socst female pred.prob
1   51.85 52.405   male    middle
2   51.85 52.405 female    middle
Given the new data, both predicted categories are ‘middle’.
If you have questions or comments about this guide or method, please email data@Princeton.edu.