In some applications, e.g. if you want to save coefficient estimates from a regression with many dummies (e.g. fixed effects), you might want to store coefficients as estimates. In this example, we are interested in storing the estimates of the `GROUPVAR` dummies, but not the dummies of `OTHERVAR`. While this is usually straightforward by writing …

# Category Archives: Stata – Tests and Regressions

# Robust Hausman test for FE vs RE

For quite a while I was writing a program to perform a Hausman test comparing Fixed vs Random Effects in Stata when the estimates are calculated using cluster-robust standard errors, since in this case the usual Hausman test is not suitable. Just when I was about to finish it, I found out that there is already a user-written program that does this, along with several other tests that fall into the “overidentifying restrictions” class for panel models. The command name is *xtoverid* and it can be very useful, especially if you are using panel IV methods or Hausman-Taylor models.

*xtoverid*
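
As a rough illustration of how the command is typically called, here is a minimal sketch. It assumes the `nlswork` example dataset that ships with Stata and an arbitrary set of regressors chosen only for demonstration; neither comes from the original post. *xtoverid* is user-written and has to be installed from SSC first (to my knowledge it also relies on *ivreg2*).

```stata
* Sketch only: dataset and regressors are illustrative assumptions
* ssc install xtoverid
* ssc install ivreg2
webuse nlswork, clear
xtset idcode year

* Random-effects model with cluster-robust standard errors
xtreg ln_wage age tenure ttl_exp, re vce(cluster idcode)

* Robust test of fixed vs random effects (overidentifying restrictions)
xtoverid
```

A rejection suggests that the additional orthogonality conditions imposed by random effects do not hold, which points towards the fixed-effects estimator.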

# Wald test

To perform a Wald test in Stata, you can simply use the `test` command after estimation.
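
A minimal sketch, assuming the built-in `auto` dataset and an arbitrary regression picked only for illustration:

```stata
* Sketch only: dataset and model are illustrative assumptions
sysuse auto, clear
regress price mpg weight foreign

* Joint Wald test that the coefficients on mpg and weight are both zero
test mpg weight

* Wald test that the two coefficients are equal
test mpg = weight
```

`test` always refers to the most recent estimation results, so it has to be run directly after the model of interest.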


# Plotting robustness analyses based on changes in definition of explanatory variable

The following might be of use when you want to plot robustness analyses based on changes in the definition of explanatory variables. I give an example: I analyze the effect of firms’ share of part-time employment on firm productivity. I define part-time …

# Plotting marginal effects of interaction terms in Stata

If your model includes interaction terms, interpreting the results is no longer straightforward. Because (at least) two different standard errors are involved, you should be careful in interpreting results as significant. I found a very nice …

# Comparing groups

In some analyses you may want to compare two groups, e.g. wages of women and men, or random assignment in an experiment. Suppose you have a continuous variable (e.g. –wage–) and a dichotomous variable (e.g. –gender–). In this case, …

# Running regressions with similar sets of variables

Quite often we run variations on regressions, including or excluding (sets of) variables. Copy-pasting the regression and eliminating the variables to be excluded is one way, but given that we speak of sets of variables why not use locals to do the work for you:
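
A minimal sketch of the idea, assuming the built-in `auto` dataset; the variable sets and the local names `base` and `extra` are hypothetical and chosen only for illustration:

```stata
* Sketch only: dataset, variables and local names are illustrative assumptions
sysuse auto, clear

* Define the sets of variables once
local base  mpg weight
local extra length foreign

* Baseline specification
regress price `base'

* Extended specification: baseline plus the extra set
regress price `base' `extra'
```

Locals only exist while the do-file (or the selected block of lines) runs, so the definitions and the regressions need to be executed together.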


# IMR based on Logit FE

This is a way to calculate a logistic Inverse Mills ratio.

The logistic IMR has some benefits when estimating a model (including correction for selection) on panel data. Because of the incidental parameter problem, it is not possible to estimate Probit FE. Hence, many researchers use a Probit RE model for the selection equation and then estimate the main FE model including the retrieved IMR. A problem with this approach is that the assumptions made are usually not plausible (differences in the correlation between regressors and the unobserved heterogeneity terms in the selection equation and the equation of interest).

# Age regressions

The following do-file can be used to plot age regressions that do not assume any functional form. It uses dummies for each year of age. Note, though, that you need a dataset with enough observations at each age to estimate it.
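
The do-file itself is not reproduced here. As a rough sketch of the general approach, and not necessarily the method used in the post, one could use factor-variable notation on the `nlswork` example dataset (an assumption made only for illustration):

```stata
* Sketch only: dataset and outcome are illustrative assumptions
webuse nlswork, clear

* One dummy per year of age, so no functional form is imposed
regress ln_wage i.age, vce(cluster idcode)

* Predicted log wage at each age, with confidence intervals
margins age
marginsplot
```

Ages with very few observations produce wide confidence intervals in the plot, which is why a sufficiently large dataset matters.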

# Table of descriptives

*Posted by Didier*

This illustrates ways to make tables of descriptives (mean or something else) for many variables (say wage, tenure, education, …) and several groups (say males and females). Neither summarize nor tabstat is useful if the variables are many. With summarize, you would need to cut, paste and edit the output in e.g. Excel. With tabstat, the table would be too wide.