Log-files are important in the workflow for two reasons:

  1. Most importantly, they keep track of any messages that are “non-fatal”, i.e. that do not stop the do-file from running. Most of the time you will ignore those messages; but when you suspect that an error has occurred, the log files are where you search for it.
  2. They are a convenient place to collect only that part of the output (the results) that you will actually need for your research project.

Within Stata there are two types of log files: command logs (cmdlog using …) and (result) logs (log using …). The latter is what I had in mind when I gave the two points above.

While preparing the data, it is convenient to have a log file that catches every command and all the output it produces. You don’t need to read it all the time, but it helps to check the log file when “something happened”.

Most people will probably say: well, I’ll just run the file once more and see where it stops or has problems. I can tell you that log files have saved me, or rather the computer, hours of work. In more complicated data preparations, where the computer runs for hours, checking a log file for errors is always faster than re-running the whole thing. Another convenient feature of a log file is that it records like a black box until the computer or the program dies. Fortunately the latter happens a little more often than the former, and that is when log files used to come in handy for me: Stata tended to crash somewhere in my do-file, and the log file made it much easier to pinpoint the exact position.

So, my data-preparation files always (when I remember to be a good boy) have the following:

capture log close
log using data_prep_project.log, replace

[ … data preparation is here ]
log close

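The capture log close closes a log file that may still be open, for example from an earlier run that stopped halfway; without it, log using would stop with an error because a log file is already open. The capture prefix matters because log close itself raises an error when no log file is open, and capture suppresses that error so the do-file carries on. The replace option in the log command allows Stata to overwrite an existing log file with the new version, which is convenient if you rerun your do-file several times.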
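
A slightly more defensive variant of this header, as a sketch only: it assumes Stata 10 or later (for named logs and log close _all), and the log name prep and the file name example_prep.log are made up for illustration. The bundled auto dataset stands in for the real data preparation.

```stata
* Close every log that an earlier, interrupted run may have left open.
* capture keeps the do-file running even if there is nothing to close.
capture log close _all

* Hypothetical file and log names; the .log extension gives a plain-text log.
log using example_prep.log, replace name(prep)

* Stand-in for the real data preparation.
sysuse auto, clear
generate price_1000 = price / 1000
summarize price_1000

* Close this log by name.
log close prep
```

Naming the log makes it explicit which file is being closed if you ever have more than one log open at the same time.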

For point 2, catching my results, I use a slightly different strategy. Here I open a log file, close it once one result is caught, and later reopen it with the append option to add only the output I need.

* First result: open the result log, estimate, close the log again
log using a_cox_1stjob.log, replace
tab tencat, generate(tenure_c)
tab agecat, generate(age_c)
tab s_nivo, gen(edu_c)
tab vestig, gen(vest_c)
tab layer, g(layer_c)
tab beo, g(eval_c)
stset date, origin(time 13223) exit(time 14340) id(birc)
* `agegrp', `tengrp' etc. are local macros holding variable lists, defined elsewhere
stcox vest_c2 vest_c3 female married `agegrp' `tengrp' `educat' `actgeb' `layer', bases(s)
log close

* Intermediate work that stays out of the result log
* (no log is open here, so no second log close is needed)
outreg using cox.out, replace
matrix beta=e(b)
matrix list beta
matrix median=J(37,1,0)
matrix length=beta*median
matrix list length

* Second result: reopen the same log and append to it
log using a_cox_1stjob.log, append
stset date, origin(time 13223) exit(time 14340)
stcox vest_c2 vest_c3 female married `agegrp' `tengrp' `educat' `actgeb' `layer' `eval' wao, bases(s)
log close

So, as you can see in this example, I use one “result log file” in which I write those results that I want to report in my paper.
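
An alternative to closing and re-appending, sketched here on the bundled auto dataset rather than on my own data: log off suspends the log and log on resumes it, so the file stays open the whole time. This is not what I did in the example above; the file name and the regressions are placeholders.

```stata
capture log close
sysuse auto, clear

* Hypothetical result log.
log using example_results.log, replace
regress price mpg weight          // first result to report
log off                           // suspend logging; the file stays open

* Intermediate work that should not end up in the result log.
matrix beta = e(b)
matrix list beta

log on                            // resume logging into the same file
regress price mpg weight foreign  // second result to report
log close
```

Which of the two you prefer is mostly a matter of taste. The close-and-append version has the advantage that the log file is complete on disk after every step, which matters if Stata is likely to crash in between.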

When do I use cmdlog? Well, I should not use it, but sometimes I do: when I am “trying to do something quickly with the data” [which I shouldn’t], I work at the command line in Stata [which I shouldn’t]. To keep whatever you have done in that process, have cmdlog running while you are playing around. That way all your commands are captured in a log file and you can build a do-file [which you should] out of your quick-and-dirty command-line work [which you should not do].
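
A minimal sketch of such a session, meant to be typed at the command line rather than run as a do-file. The file name example_session.txt is a placeholder and the commands on the auto dataset are only there to have something to record.

```stata
* Make sure no command log is still open from an earlier session.
capture cmdlog close

* Start recording typed commands; the file name is a placeholder.
cmdlog using example_session.txt, replace

sysuse auto, clear
tabulate foreign
summarize price if foreign

* Stop recording. The text file now holds the commands only, no output,
* ready to be tidied up into a do-file.
cmdlog close
```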
