DID in Stata: Differences-in-Differences Stata tutorial

Basic differences-in-differences estimation using Stata

Using the "basic" method

  • Getting sample data.

use "https://dss.princeton.edu/training/Panel101.dta", clear

  • Create a dummy variable to indicate the time when the treatment started. Let's assume that the treatment started in 1994. In this case, years before 1994 will have a value of 0, and years from 1994 onward a 1.

gen time = (year>=1994) & !missing(year)

  • Create a dummy variable to identify the group exposed to the treatment. In this example, let's assume that the countries with codes 5, 6, and 7 were treated (=1). Countries 1-4 were not treated (=0).

gen treated = (country>4) & !missing(country)

  • Create an interaction between time and treated. We will call this interaction ‘did’. It equals 1 only for treated countries from 1994 onward.

gen did = time*treated
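
Before estimating, it can help to confirm that the three dummies line up as intended. The following is a minimal sanity check, not part of the original walkthrough; it only uses the variables created in the steps above and reloads the data so that it runs on its own.

```stata
* Rebuild the dummies from the steps above so this check is self-contained
use "https://dss.princeton.edu/training/Panel101.dta", clear
gen time    = (year>=1994) & !missing(year)
gen treated = (country>4) & !missing(country)
gen did     = time*treated

* Cross-tabulate group by period: all four cells should be non-empty
tabulate treated time

* did should equal 1 only for treated countries from 1994 onward
assert did == (treated==1 & time==1)
tabulate did
```

If the assert line stops with an error, one of the dummies was not built as described.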

  • Estimating the DID estimator

reg y time treated did, r

. reg y time treated did, r

Linear regression                               Number of obs     =         70
                                                F(3, 66)          =       2.17
                                                Prob > F          =     0.0998
                                                R-squared         =     0.0827
                                                Root MSE          =    3.0e+09

------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        time |   2.29e+09   9.00e+08     2.54   0.013     4.92e+08    4.09e+09
     treated |   1.78e+09   1.05e+09     1.70   0.094    -3.11e+08    3.86e+09
         did |  -2.52e+09   1.45e+09    -1.73   0.088    -5.42e+09    3.81e+08
       _cons |   3.58e+08   7.61e+08     0.47   0.640    -1.16e+09    1.88e+09
------------------------------------------------------------------------------
  • The coefficient for ‘did’ is the differences-in-differences estimator. The estimated treatment effect is negative (-2.52e+09) and statistically significant at the 10% level (p = 0.088).
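
To see why this coefficient is called a difference in differences, it can be reproduced from four group means: the before-to-after change in the treated group minus the before-to-after change in the control group. The sketch below is an illustration added for this purpose; the scalar names (c_before, c_after, t_before, t_after, did_manual) are hypothetical and not part of the original example.

```stata
* Rebuild the dummies so this example runs on its own
use "https://dss.princeton.edu/training/Panel101.dta", clear
gen time    = (year>=1994) & !missing(year)
gen treated = (country>4) & !missing(country)
gen did     = time*treated

* Mean of y in each group-by-period cell
quietly summarize y if treated==0 & time==0
scalar c_before = r(mean)
quietly summarize y if treated==0 & time==1
scalar c_after  = r(mean)
quietly summarize y if treated==1 & time==0
scalar t_before = r(mean)
quietly summarize y if treated==1 & time==1
scalar t_after  = r(mean)

* Change in the treated group minus change in the control group
scalar did_manual = (t_after - t_before) - (c_after - c_before)

* Compare with the regression coefficient on did
quietly reg y time treated did, r
display "DID from cell means: " did_manual
display "DID from regression: " _b[did]
```

Because the regression includes both dummies and their interaction, the two displayed numbers should agree up to rounding. The regression is still the more convenient route, since it also delivers the standard error.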

Using the "hashtag" method

  • There is no need to generate the interaction by hand when using the hashtag method: the ## operator includes both main effects and their interaction. Estimate using the following command:

reg y time##treated, r

. reg y time##treated, r

Linear regression                               Number of obs     =         70
                                                F(3, 66)          =       2.17
                                                Prob > F          =     0.0998
                                                R-squared         =     0.0827
                                                Root MSE          =    3.0e+09

------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      1.time |   2.29e+09   9.00e+08     2.54   0.013     4.92e+08    4.09e+09
   1.treated |   1.78e+09   1.05e+09     1.70   0.094    -3.11e+08    3.86e+09
             |
time#treated |
        1 1  |  -2.52e+09   1.45e+09    -1.73   0.088    -5.42e+09    3.81e+08
             |
       _cons |   3.58e+08   7.61e+08     0.47   0.640    -1.16e+09    1.88e+09
------------------------------------------------------------------------------
  • The coefficient for ‘time#treated’ is the differences-in-differences estimator (‘did’ in the previous example). The estimated treatment effect is negative and statistically significant at the 10% level (p = 0.088).
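
The ## operator is factor-variable shorthand. As a minimal sketch, the example below fits the model both ways and retrieves the interaction coefficient by name; the scalar names b_short and b_long are hypothetical, chosen only for this illustration.

```stata
* Rebuild the dummies so this example runs on its own
use "https://dss.princeton.edu/training/Panel101.dta", clear
gen time    = (year>=1994) & !missing(year)
gen treated = (country>4) & !missing(country)

* time##treated expands to both main effects plus their interaction
quietly reg y time##treated, r
scalar b_short = _b[1.time#1.treated]

* The same model with every term written out
quietly reg y i.time i.treated i.time#i.treated, r
scalar b_long = _b[1.time#1.treated]

display "DID with ## shorthand:  " b_short
display "DID with terms spelled: " b_long
```

A single # would add only the interaction term without the main effects, which is why the DID regression uses ##.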

Using the "diff" command

  • The command diff is a user-written command and is not shipped with Stata. To install it, type

ssc install diff

  • Estimating using the diff command

diff y, t(treated) p(time)

Note: "treated" and "time" in parentheses are dummies for treatment and time; see the "basic" method

. diff y, t(treated) p(time)

Number of observations in the DIFF-IN-DIFF: 70
            Baseline       Follow-up
   Control: 16             24          40
   Treated: 12             18          30
            28             42
---------------------------------------------------------------
 Outcome var.   |  y       | S. Err.   |  t       | P>|t|
---------------------------------------------------------------
Baseline        |          |           |          |
   Control      |  3.6e+08 |           |          |
   Treated      |  2.1e+09 |           |          |
   Diff (T-C)   |  1.8e+09 |  1.1e+09  |  1.58    | 0.120
Follow-up       |          |           |          |
   Control      |  2.6e+09 |           |          |
   Treated      |  1.9e+09 |           |          |
   Diff (T-C)   | -7.4e+08 |  9.2e+08  | -0.81    | 0.422
                |          |           |          |
Diff-in-Diff    | -2.5e+09 |  1.5e+09  | -1.73    | 0.088*
---------------------------------------------------------------
R-square:    0.08
* Means and Standard Errors are estimated by linear regression
**Inference: *** p<0.01; ** p<0.05; * p<0.1

Note: the value 0.088 in the Diff-in-Diff row is the p-value for the treatment effect, or DID estimator.

Type help diff for more details and options.

References/Additional Reading

Angrist, J. D., & Pischke, J. S. (2009). Mostly harmless econometrics: An empiricist's companion. Princeton University Press.
 
Greene, W. H. (2018). Econometric analysis (8th ed.). Pearson.
 
Stock, J. H., & Watson, M. W. (2019). Introduction to econometrics (4th ed.). Pearson.
 
Waldinger, F. (n.d.). Lecture 3: Differences-in-Differences. Available at: https://silo.tips/download/lecture-3-differences-in-differences, accessed August 10, 2022.
 
Wooldridge, J. (2007). What’s new in econometrics? Lecture 10: Difference-in-differences estimation. NBER Summer Institute. Available at: https://www.nber.org/sites/default/files/2021-03/slides_10_diffindiffs.pdf, accessed August 8, 2022.

Data Consultant

Muhammad Al Amin
He/Him/His
Contact:
Firestone Library, A-12-F.1
609-258-6051

Data Consultant

Yufei Qin
Contact:
Firestone Library, A.12F.2
609-258-2519

Comments or Questions?

If you have questions or comments about this guide or method, please email data@Princeton.edu.