Log-files are important in the workflow for two reasons:
- Most importantly, they keep track of any messages that are “non-fatal”, i.e. that do not stop the progress of the do-file. Most of the time you will ignore those messages; only when you suspect that an error has occurred do you search through your log files.
- It is also convenient to use the log-file to collect only that part of the output (results) that you will actually need for your research project.
Within Stata there are two types of log files: command logs (cmdlog using …) and (result) logs (log using …). The latter is what I had in mind when I gave the two points above.
While preparing the data it is convenient to have a log file that catches every command and all of its output. You don’t need to look through the log files all the time, but it helps to check them when “something happened”.
While most people will probably say “well, I’ll just run the file once more and see where it stops or has problems”, I can tell you that log files have saved me (or rather the computer) hours of work. In more complicated data preparations, where the computer runs for hours, checking a log file for errors is always faster than re-running the do-file. The other convenient feature of a log file is that it records like a black box until the computer or the program dies. Fortunately the latter happens a little more often than the former, and that is when log files came in handy for me: Stata tended to crash somewhere in my do-file, and the log file made it easy to pinpoint the exact position.
So, my data-preparation files always (when I remember to be a good boy) have the following:
capture log close
log using data_prep_project.log, replace
[ … data preparation is here ]
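The capture log close line closes a log file that is still open, for example one left over from an earlier run that stopped halfway. The capture prefix is needed because log close on its own exits with an error when no log is open, and without the log close the following log using would exit with an error when a log is already open. The replace option in the log command allows Stata to overwrite an existing log-file with the new version, which is convenient if you rerun your do-file several times.
For point 2, catching my results, I use a slightly different strategy. Here I open a log file, close it once a result is caught, and later reopen it with the append option so that only the necessary output is added.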
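The header above can be extended into a complete skeleton. The following is only a minimal sketch, not a prescribed layout: it assumes a Stata version that supports log close _all, and the explicit text option and the closing log close are my additions rather than part of the original snippet.

```stata
* Sketch of a data-preparation do-file with a log opened at the top and closed at the end
capture log close _all                          // close any log still open; no error if none is
log using data_prep_project.log, replace text   // plain-text log, overwritten on every run

* ... data preparation commands go here ...

log close                                       // close the log explicitly when the do-file ends
```

Closing the log at the end is not strictly required when the next run starts with capture log close, but it means the file is released as soon as the do-file finishes.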
log using a_cox_1stjob.log, replace
tab tencat, generate(tenure_c)
tab agecat, generate(age_c)
tab s_nivo, gen(edu_c)
tab vestig, gen(vest_c)
tab layer, g(layer_c)
tab beo, g(eval_c)
stset date, origin(time 13223) exit(time 14340) id(birc)
stcox vest_c2 vest_c3 female married `agegrp' `tengrp' `educat' `actgeb' `layer', bases(s)
outreg using cox.out, replace
matrix list beta
matrix list length
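* close the log once the first result is caught; reopening an open log would stop with an error
log close
log using a_cox_1stjob.log, append
stset date, origin(time 13223) exit(time 14340)
stcox vest_c2 vest_c3 female married `agegrp' `tengrp' `educat' `actgeb' `layer' `eval' wao, bases(s)
log close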
So, as you can see in this example, I use one “result log file” in which I write those results that I want to report in my paper.
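An alternative to closing and re-appending is to suspend the log with log off and resume it with log on. This is a different technique from the one I use above, shown here only as a sketch. The file name results_sketch.log is hypothetical, and the auto dataset that ships with Stata stands in for real project data.

```stata
* Sketch: keep only selected results in a log by suspending and resuming it
capture log close
sysuse auto, clear                           // example dataset shipped with Stata
log using results_sketch.log, replace text   // hypothetical file name

regress price mpg weight                     // recorded in the log

log off                                      // suspend logging; the file stays open
summarize price mpg weight                   // not recorded
log on                                       // resume logging into the same file

regress price mpg weight foreign             // recorded in the log
log close
```

The close-and-append pattern has the advantage that the log file is complete on disk after each result, which matters if Stata crashes in between.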
When do I use cmdlog? Well, I should not use it, but sometimes I do: when I am “trying to do something quickly with the data” [which I shouldn’t], I work from the command line in Stata [which I shouldn’t]. To keep a record of whatever you have done in that process, you should have cmdlog running while you are playing around. That way all your commands are captured in a log-file and you can build a do-file [which you should] out of your quick-and-dirty command-line work [which you should not do].
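A minimal sketch of such a session, typed at the command line rather than run from a do-file. The file name scratch_session.txt is hypothetical.

```stata
cmdlog using scratch_session.txt, replace
* ... interactive commands typed at the command line ...
cmdlog close
```

The resulting file contains only the commands, without any output, so it can be cleaned up and saved as a do-file.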