Working on firm-level data (again), I have had to clean up hundreds of different spellings of occupations. Eventually these should be grouped into a set of categories that differ only where the occupations themselves actually differ. Let me call the variable occupation.
34. slesar po remontu la
38. slesar po rem. la
44. slesar po rem. i obsluzh. vent. i kondicionirovaniya
54. slesar po rem. i obsluzh. ven. i kondicionirovaniya
146. slesar po rem. la
205. slesar po remontu agregatov
259. slesar po rem.agregatov
313. slesar po remontu kompressornyh ustanovok i oborudovaniya
343. slesar po remontu oborud
A wonderful mess, and actually only a small part of the full dataset.
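As a rough illustration of the kind of cleanup such a list calls for, here is a minimal sketch. It is not necessarily the approach of the full post. It assumes occupation is a string variable; the names occ_clean and occ_group are hypothetical, and expanding "rem." to "remontu" is only one example of a rule you might choose.

```stata
* Sketch only: occ_clean and occ_group are hypothetical variable names.
* Lower-case the entries and strip leading/trailing blanks
gen occ_clean = lower(trim(occupation))

* Expand the abbreviation "rem." to "remontu" (all occurrences), then
* collapse the double blanks this can create
replace occ_clean = subinstr(occ_clean, "rem.", "remontu ", .)
replace occ_clean = itrim(occ_clean)

* Assign one category to every entry that starts with the same stem
gen occ_group = ""
replace occ_group = "slesar po remontu" if strpos(occ_clean, "slesar po remontu") == 1

* Check what is left to clean
tabulate occ_clean if occ_group == ""
```

Keeping the original occupation variable untouched makes it easy to check each rule against the raw spellings before adding the next one.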
Continue reading “Cleaning up messy (string) variables”
Category: Stata
Age regressions
The following do-file can be used to plot age regressions that do not assume any functional form. It uses a dummy for each year of age. Note, though, that you need a dataset with enough observations to estimate all of these dummies. Continue reading “Age regressions”
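A minimal sketch of the idea, not the do-file from the post: it assumes a dataset with an outcome variable wage and an integer variable age (both names are hypothetical) and a Stata version that supports factor variables and marginsplot.

```stata
* Sketch only: wage and age are hypothetical variable names.
* i.age enters one dummy per year of age, so no functional form is imposed
regress wage i.age

* Predicted outcome at each age, with confidence intervals
margins age

* Plot the predictions against age
marginsplot
```

Ages with very few observations will show up as wide confidence intervals in the plot, which is one way to see whether the dataset is large enough.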
Export of regression tables to LaTeX
The ado-files esto and esta (to be installed by typing findit esto or findit esta into the Stata command window) provide a simple way to export regression tables from Stata to a separate LaTeX file. At the same time, it is possible to adjust basically everything. I will just present a short example that I use for my regression tables (for adjusting the code, see help esta). After installing the ado-packages, run (in this case) two regressions, in my case: Continue reading “Export of regression tables to LaTeX”
Export summary statistics to LaTeX
The ado-file sutex (to be installed by typing findit sutex into the Stata command window) provides a simple way to export summary statistics from Stata to a separate LaTeX file. It offers only limited customisation, but is quite OK for most applications. By default, you get the mean and standard deviation for the variables VAR1, VAR2, VAR3, etc. A syntax could be: Continue reading “Export summary statistics to LaTeX”
Table of descriptives
Posted by Didier
This illustrates ways to make tables of descriptives (the mean or something else) for many variables (say wage, tenure, education, …) and several groups (say males and females). Neither summarize nor tabstat is useful if the variables are many. With summarize, you would need to cut, paste and edit the output in e.g. Excel. With tabstat, the table would be too wide. Continue reading “Table of descriptives”
Using locals and loops to generate long strings
The following approach uses a local and a loop to build a command that involves several new variables, which are themselves generated within the loop.
Without a loop, this would have to be done by hand (e.g. when generating a number of log variables):
gen newvar1 = log(var1)
gen newvar2 = log(var2)
etc.
Continue reading “Using locals and loops to generate long strings”
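The loop version can be sketched as follows. This is a minimal illustration rather than the code from the full post: it assumes three variables var1 to var3 exist, and the local name newvars is hypothetical.

```stata
* Sketch only: assumes var1, var2 and var3 exist; newvars is a hypothetical local.
local newvars ""
forvalues i = 1/3 {
    * Create the log variable for this index
    gen newvar`i' = log(var`i')
    * Append its name to the growing string held in the local
    local newvars "`newvars' newvar`i'"
}

* The local now lists newvar1 newvar2 newvar3 and can be used in any command
summarize `newvars'
```

Because the list of names is built inside the loop, changing the range in forvalues is enough to update both the generated variables and every command that uses the local.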