Add normal distribution to kernel density plot in Stata

This might already be implemented in the most recent version of Stata, but I just ran into the problem that there seems to be no straightforward way to combine a kernel density plot (i.e. kdensity) with the normal distribution of the underlying variable.

Continue reading “Add normal distribution to kernel density plot in Stata”
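
As a rough sketch of what such a combined plot can look like (the full post may take a different route): recent Stata versions accept a normal option on kdensity, and the overlay can also be built by hand with twoway function. The auto dataset and the variable price are stand-ins chosen for illustration, not taken from the post.

```stata
* Illustration only: auto.dta and price are assumed example data
sysuse auto, clear

* Option 1: let kdensity overlay a normal density itself
kdensity price, normal

* Option 2: build the overlay by hand from the sample mean and sd
quietly summarize price
local m  = r(mean)
local s  = r(sd)
local lo = r(min)
local hi = r(max)
twoway (kdensity price) ///
       (function y = normalden(x, `m', `s'), range(`lo' `hi')), ///
       legend(order(1 "Kernel density" 2 "Normal density")) ///
       ytitle("Density") xtitle("Price")
```

The manual version is more verbose, but it gives full control over line styles, legend labels and the plotted range.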

Check whether variable exists in if-conditions

In some applications, e.g. if you want to save coefficient estimates from a regression with many dummies (e.g. fixed effects), you might want to store coefficients as estimates. In this example, we are interested in storing the estimates of the GROUPVAR dummies, but not the dummies of OTHERVAR. While this is usually straightforward by writing

Continue reading “Check whether variable exists in if-conditions”
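
One generic way to test for a variable before using it, sketched here without knowing the exact solution in the full post, is to combine capture with confirm variable and branch on the return code _rc. The auto dataset and the dummy names grp1 to grp5 are assumptions made for this example; they stand in for the GROUPVAR dummies.

```stata
* Illustration only: auto.dta and the grp* dummies are assumed example data
sysuse auto, clear
quietly tabulate rep78, generate(grp)   // creates grp1 ... grp5

* Loop one step past the last dummy to show the "missing" branch
forvalues i = 1/6 {
    capture confirm variable grp`i'
    if _rc == 0 {
        display "grp`i' exists"
    }
    else {
        display "grp`i' not found, skipping"
    }
}
```

Because capture suppresses the error, the loop keeps running when a dummy is absent, which is what makes the check usable inside an if-condition.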

Running regressions with similar sets of variables

Quite often we run variations of a regression, including or excluding (sets of) variables. Copy-pasting the regression and deleting the variables to be excluded is one way, but since we are dealing with sets of variables, why not let locals do the work for you:
Continue reading “Running regressions with similar sets of variables”
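
A minimal sketch of the idea, assuming the auto dataset and two made-up variable sets (controls and extras are hypothetical names, not from the post):

```stata
* Illustration only: auto.dta and the variable sets are assumed example data
sysuse auto, clear

* Define each set of variables once
local controls weight length
local extras   foreign headroom

* Reuse the sets across specifications
regress price mpg `controls'
regress price mpg `controls' `extras'
```

Locals only live for the duration of the do-file or the selection being run, so the definitions and the regressions need to be executed together.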

IMR based on Logit FE

This is a way to calculate a logistic Inverse Mills ratio.

The logistic IMR has some benefits when estimating a model (including a correction for selection) on panel data. Because of the incidental parameter problem, it is not possible to estimate a Probit FE model. Hence, many researchers use a Probit RE model for the selection equation and then estimate the main FE model including the retrieved IMR. A problem with this approach is that the assumptions made are usually not plausible (differences in the correlation between the regressors and the unobserved heterogeneity terms in the selection equation and in the equation of interest).

Continue reading “IMR based on Logit FE”

Table of descriptives

Posted by Didier

This illustrates ways to make a table of descriptives (means or something else) for many variables (say wage, tenure, education, …) and several groups (say males and females). Neither summarize nor tabstat is useful if there are many variables. With summarize, you would need to cut, paste and edit the output in e.g. Excel. With tabstat, the table would be too wide.

Continue reading “Table of descriptives”
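
One possible layout, sketched here without knowing the approach taken in the full post, is to put variables in rows and groups in columns by filling a matrix in a loop. The auto dataset, the variables price, mpg, weight and length, and the grouping variable foreign are assumptions standing in for wage, tenure, education and gender.

```stata
* Illustration only: auto.dta, the variable list and foreign are assumed example data
sysuse auto, clear
local vars price mpg weight length
local nvars : word count `vars'

* One row per variable, one column per group
matrix T = J(`nvars', 2, .)
matrix rownames T = `vars'
matrix colnames T = Domestic Foreign

local i = 0
foreach v of local vars {
    local ++i
    forvalues g = 0/1 {
        quietly summarize `v' if foreign == `g'
        matrix T[`i', `g' + 1] = r(mean)
    }
}

matrix list T, format(%9.2f)
```

Since the table grows downwards rather than sideways, adding more variables only adds rows, and r(mean) can be swapped for any other result that summarize returns.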