When you write many do-files over the course of a research project, it is hard to keep track of what each do-file was for, what input it needs, and what output it generates. This is especially true when your paper comes back from the (journal) referees with requests for changes and you want to re-run part of the analysis a year after you did it: by then it is hard to remember exactly what you need to do.
I use a preamble in my do-files to document this information (at least roughly), and also to set a couple of standard pointers that make my work easier.
Here is a typical (beginning of a) do-file:
* alt_prep1999.do ==> Name of the do-file
* prepare data ==> purpose of the do-file
* needs: SSB banen (original) ==> input for do-file
* generates: reos$jaar.dta; uurloon$jaar.dta ==> output of do-file
* Ben Kriechel
* Lex Borghans
* old version: 2006.05.04
* this version: 2007.03.29
version 9.0 /* ==> this ensures that commands are interpreted as in Stata 9 */
set more off
* set virtual on
set mem 1000m
In the comment block I write the name of the do-file, its purpose, the input (data), the output (data), who has worked on it, when I started, and what the current version is.
Then I invoke Stata's version command, which ensures that your (old) do-files give the same results in a newer Stata as in the one they were written for: with version 8.0 , a do-file behaves in Stata 10 as it did in Stata 8.
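To see what the version command does in practice, here is a minimal sketch. It assumes Stata 9 or later and uses the auto example dataset that ships with Stata; neither the dataset nor the summarize call comes from my own do-file, they only serve as an illustration. Note that sysuse with the clear option replaces whatever data you have in memory.

```stata
* Sketch only: assumes Stata 9 or later and the auto dataset shipped with Stata.
version 9.0                      // interpret what follows as Stata 9 would
display c(version)               // version the interpreter is currently set to
display c(stata_version)         // version of the Stata that is actually running

sysuse auto, clear               // example data, not part of the original do-file
version 8: summarize price       // run a single command under version 8 rules
```

A version statement inside a do-file only applies while that do-file runs, so putting it in the preamble does not change the settings of your interactive session.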
Finally, I clear the memory, assign 1 GB of memory to Stata, and set some data directories and files. I use global macros for that: the command global lets me assign the data directory to a macro named data, which I then refer to as $data . A global macro works just like a local macro, except that it keeps its value across do-files.
So dir $data\ would give me the contents of the data directory.
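The global macro step could look roughly like the sketch below. The directory path and the year are hypothetical placeholders, not my actual settings; only the names data, jaar, and reos$jaar.dta are taken from the preamble above. I use forward slashes in the sketch because Stata accepts them on Windows too, and because a backslash placed directly before a $ is treated as an escape that stops the macro from being expanded.

```stata
* Sketch only: the path and the year below are hypothetical placeholders.
clear
global data "C:/projects/wages/data"        // assumed project data directory
global jaar 1999                            // year used in file names such as reos$jaar.dta

display "$data/reos$jaar.dta"               // shows how the two globals expand

local note "only visible in this do-file"   // a local macro, for comparison
display "`note'"

* Once the directory and file exist, these would list and load the data:
* dir "$data/"
* use "$data/reos$jaar.dta", clear
```

After this do-file has finished, $data and $jaar are still defined for the next do-file you run in the same session, while the local macro note is gone.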