Often I use locals to calculate something (such as the mean of a variable for a specific group) and use this in a loop (say, over groups or over years). If you, for example, want to calculate the share of observations that belong to one group (say: female==1), you could simply write
Continue reading Format how locals are displayed in Stata
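A minimal sketch of the idea, not the post's own code: it uses the auto dataset that ships with Stata, with foreign standing in for female, and the local names share and share_fmt are made up for illustration.

```stata
sysuse auto, clear

* Share of observations in one group (foreign==1 stands in for female==1)
count if foreign == 1
local share = r(N) / _N
display "Unformatted: `share'"

* Keep a formatted copy via the extended macro function ": display"
local share_fmt : display %4.2f `share'
display "Formatted:   `share_fmt'"
```

The unformatted local keeps full precision for further calculations, while the formatted copy is only meant for output such as titles or notes.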
Do-files in Stata easily get a bit lengthy. Of course, you can try to shorten do-files and distribute code onto several do-files and have one master file that runs all of the respective sub-do-files (which are included by do dosubfile1.do). Alternatively, you can leave the do-file longish but write your code such that you only run parts of the code at once:
Continue reading Piecewise execution of do-files in Stata
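One possible way to run only parts of a long do-file is to wrap each part in an if-block controlled by a local switch. This is a sketch under that assumption and may differ from the approach in the full post; the switch names run_prep and run_analysis are hypothetical.

```stata
* Switches: 1 runs a block, 0 skips it (names are hypothetical)
local run_prep     1
local run_analysis 0

sysuse auto, clear

if `run_prep' {
    * Data preparation block
    gen log_price = log(price)
}

if `run_analysis' {
    * Analysis block, skipped while the switch is 0
    regress log_price mpg weight
}
```

Because locals only live within one run, the switch definitions have to be executed together with the blocks they control.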
Proper use of single and double quotation marks is essential when working in Stata, especially when writing loops where locals can be a huge time and memory saver. The use of single and double quotation marks is rather straightforward (using ` and ' for single, and " for double quotation marks). You can rather easily define a local, e.g. based on the average of a variable
Continue reading Use of embedded quotation marks within locals in Stata
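As a small illustration of embedded quotation marks, the sketch below stores a condition that itself contains double quotes in a local, using Stata's compound double quotes. The dataset (auto) and the local name cond are chosen for illustration only.

```stata
sysuse auto, clear

* The outer `" and "' delimit the string, so the inner " survive
local cond `"make == "AMC Concord""'

* Display the local as text: wrap it in compound double quotes again
display `"`cond'"'

* Used without quotes, the local expands into an ordinary if-expression
count if `cond'
```

With plain double quotes around the macro, the display line would break at the first embedded quotation mark.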
In some applications, e.g. if you want to save coefficient estimates from a regression with many dummies (e.g. fixed effects), you might want to store coefficients as estimates. In this example, we are interested in storing the estimates of the GROUPVAR dummies, but not the dummies of OTHERVAR. While this is usually straightforward by writing
Continue reading Check whether variable exists in if-conditions
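To check whether a variable exists before using it, one common pattern is capture confirm variable followed by a test of the return code _rc. The sketch below assumes the auto dataset; not_there is a deliberately non-existent, hypothetical variable name.

```stata
sysuse auto, clear

* confirm sets a nonzero return code if the variable is missing;
* capture suppresses the error so the do-file keeps running
capture confirm variable mpg
if _rc == 0 {
    summarize mpg
}

capture confirm variable not_there
if _rc != 0 {
    display "variable not_there does not exist"
}
```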
Unfortunately, the otherwise great Stata command egen does not allow you to standardize variables by group, e.g. for each year separately. There is a small workaround: calculate the mean and SD first, and then manually create the standardized variable (and then you really wonder why this is not implemented in Stata).
Continue reading Standardize variables by group
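The workaround described above can be sketched as follows. The example assumes the auto dataset, with foreign standing in for a grouping variable such as year; the new variable names are made up.

```stata
sysuse auto, clear

* Step 1: group-specific mean and SD
bysort foreign: egen mean_price = mean(price)
bysort foreign: egen sd_price   = sd(price)

* Step 2: standardize manually within each group
gen z_price = (price - mean_price) / sd_price

* Check: mean should be about 0 and SD about 1 in every group
tabstat z_price, by(foreign) statistics(mean sd)
```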
Especially during data preparation, but also when running whole sets of analyses, we end up repeating commands and sets of commands for similar variables. For example, in one of my projects I had to process monthly salary information in wide format, and for several reasons I could not use reshape:
I could have typed:
gen str10 v201_1=""
replace v201_1=c2 if c1=="201"
gen str10 v201_2=""
replace v201_2=c3 if c1=="201"
gen str10 v201_12=""
replace v201_12=c13 if c1=="201"
Continue reading Repetitive tasks … let STATA do the work
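A loop can replace the repeated gen/replace pairs. The sketch below assumes the pattern suggested by the lines above, namely that month m is stored in column c(m+1) for all twelve months; the toy data are invented so the example runs on its own.

```stata
* Toy data: c1 holds a code, c2-c13 hold twelve monthly string values
clear
set obs 3
gen str3 c1 = cond(_n == 2, "999", "201")
forvalues j = 2/13 {
    gen str10 c`j' = "m" + string(`j' - 1)
}

* One loop instead of twelve gen/replace pairs
forvalues m = 1/12 {
    local j = `m' + 1          // assumed mapping: month m sits in column c(m+1)
    gen str10 v201_`m' = ""
    replace v201_`m' = c`j' if c1 == "201"
}

list c1 v201_1 v201_12
```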
One huge disadvantage of NetQuestionnaire (NetQ) is that randomisation is only possible for the order of sub-items of questions. There is, however, a way to use Stata to make anything random (e.g. the order of questions, content of questions). Below, I describe a way to randomly generate sets of vignettes. At the end of the do-file, the text is saved to an Excel file. Merged with an address list, this information can be imported into NetQ and used in questions via the NetQ variables.
Continue reading Using Stata to randomise vignettes in NetQuestionnaire
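A rough sketch of the general idea, not the do-file from the full post: each respondent is randomly assigned one of three vignette texts, and the result is written to an Excel file. The number of respondents, the vignette texts, the variable names and the file name vignettes.xlsx are all hypothetical.

```stata
clear
set seed 12345            // fix the seed so the draw is reproducible
set obs 10                // ten hypothetical respondents
gen id = _n

* Draw a random integer between 1 and 3 for every respondent
gen byte draw = ceil(3 * runiform())

* Attach the (hypothetical) vignette text that belongs to each draw
gen str40 vignette = "Text of vignette A" if draw == 1
replace vignette = "Text of vignette B" if draw == 2
replace vignette = "Text of vignette C" if draw == 3

* Save the text so it can be merged with an address list
export excel id vignette using "vignettes.xlsx", firstrow(variables) replace
```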
The following approach can be used to build a command that consists of several new variables which are generated within a loop.
This could either be done by (e.g. generating a number of log variables)
gen newvar1 = log(var1)
gen newvar2 = log(var2)
Continue reading Using locals and loops to generate long strings
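Alternatively, a loop can generate the variables and collect their names in a local at the same time. This is a minimal sketch with invented toy data; the local name newvars is an assumption, not taken from the post.

```stata
* Toy data with var1-var3 so the example runs on its own
clear
set obs 5
forvalues i = 1/3 {
    gen var`i' = _n * `i'
}

* Generate the log variables and append each name to the local
local newvars
forvalues i = 1/3 {
    gen newvar`i' = log(var`i')
    local newvars `newvars' newvar`i'
}

* The local now holds the full list and can be used in any command
display "`newvars'"
summarize `newvars'
```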