When dealing with string variables in Stata, blank spaces can make it difficult to identify values. For example, if a variable contains
" Arizona", a command with an if qualifier such as
... if state=="Arizona" won’t detect this observation.
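One way to deal with the leading blank in the example above is to strip it with trim(). This is a minimal sketch; the variable name state is taken from the example, and everything else is an assumption about your data.

```stata
* Assumes a string variable called state, as in the example above
* Count the observations that carry leading or trailing blanks
count if state != trim(state)

* Remove leading and trailing blanks in place
replace state = trim(state)

* The comparison now matches the cleaned values
list if state == "Arizona"
```

If you would rather keep the raw values, the comparison can also be written as ... if trim(state)=="Arizona" without changing the variable.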
For certain cleaning jobs, it can be useful to duplicate an observation (often only temporarily). To create an identical copy of an observation, just type
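The command itself is cut off in this excerpt. As a sketch of one common way to do this, and not necessarily the one the post had in mind, expand can duplicate selected observations. The identifier id and the flag is_copy are hypothetical names.

```stata
* Hypothetical identifier id; duplicate the observation with id == 5
expand 2 if id == 5, generate(is_copy)

* is_copy is 0 for the original observations and 1 for the added copy
list if id == 5

* Drop the temporary copy again once it is no longer needed
drop if is_copy == 1
drop is_copy
```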
I just ran across a very useful ado-file called use13, which lets you open files in Stata 12 that were originally saved in Stata 13 (opening them directly often fails with the error “r(610): file not Stata format”). use13 can be installed by typing
ssc install use13
Although most sources say that the storage type in Stata should not matter, it is worth checking whether this holds for your dataset. I just came across a situation where two identically constructed datasets (one stored in the default type, float, and one stored in double) generated different output. Before that, I also encountered a problem with person identifiers in the GSOEP when using the default storage type. If your dataset is not huge (with the GSOEP it still works quite well), it may be worth playing it safe and using
set type double
before you assemble your dataset. New variables are then stored as double, the most precise numeric storage type Stata offers.
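To see why identifiers in particular can suffer, here is a small self-contained sketch. The nine-digit number is made up for illustration and is not an actual GSOEP identifier.

```stata
clear
set obs 1

* A float holds only about 7 significant digits, so a 9-digit ID gets rounded
generate float  id_float  = 123456789
generate double id_double = 123456789

format id_float id_double %12.0f

* id_float is stored as 123456792, id_double as 123456789
list id_float id_double
```

Two different identifiers can be rounded to the same float value, which is one way merges and by-group operations on person identifiers go wrong.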
A problem when working on the same project on different platforms (here: Windows and Mac/OS X) is that path names differ. There are two straightforward solutions to this:
1) When defining a number of different paths (e.g. one path where data is stored, one where results/output is stored), it is handy to define the paths as globals and to add an “if” condition. The platform can be detected via `c(os)':
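A minimal sketch of this approach follows. The folder names and the file mydata.dta are hypothetical and need to be replaced with your own; c(os) returns "Windows", "MacOSX" or "Unix".

```stata
* Hypothetical project folders; replace with your own locations
if "`c(os)'" == "Windows" {
    global data    "C:/projects/myproject/data"
    global results "C:/projects/myproject/results"
}
else if "`c(os)'" == "MacOSX" {
    global data    "/Users/me/projects/myproject/data"
    global results "/Users/me/projects/myproject/results"
}

* The rest of the do-file only refers to the globals
use "$data/mydata.dta", clear
```

Forward slashes work in Stata paths on Windows as well, so only the root of each path has to differ between platforms.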
Most of you know that I don’t like to have SPSS on my computer, let alone use it. Over time I nevertheless had several versions of SPSS installed, because SPSS allowed me to open an SPSS file and then save it in … Stata format. Now there is no reason to do that anymore, and that is not due to Stat/Transfer!
When cleaning datasets one often has string variables containing categories (e.g. country names). A simple way of transforming such a variable into a numeric variable containing the same information is encode. encode assigns the numerical values 1, 2, … to newvar, while the original values (e.g. country names) are kept as value labels.
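As a short sketch, assuming a string variable called country (a hypothetical name), the conversion could look like this:

```stata
* Hypothetical string variable country holding country names
encode country, generate(country_id)

* Show the mapping from numeric codes to the original strings
label list country_id

* The same tabulation with labels and with the underlying codes
tabulate country_id
tabulate country_id, nolabel
```

The codes are assigned in alphabetical order of the strings, so blanks or spelling variants in the original variable end up as separate categories and are best cleaned beforehand.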
Working scientifically with statistics software implies that the analysis should be run from batch files, in Stata terms do-files. This is important so that results can be reproduced and, if errors are found, the analysis can be run anew. I have been using a set-up in which I divide the empirical research into several steps, which allows me to reproduce my steps, save time, and ensure that I can easily back up my research without too much hassle.
For those of you working with spell data (could be panel, but I am thinking more of event-history data), there is a great tool that you should be aware of. You can get it from SSC by typing the following command:
ssc install tsspell
What can it do for you? Well, as I said it puts a spell on your data. By giving the command
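The command is cut off in this excerpt. As a rough sketch of a typical call, assuming a panel with a person identifier id, a time variable year and a status variable empstat (all hypothetical names), it might look as follows; check help tsspell for the exact options.

```stata
* tsspell requires the data to be declared as time-series or panel data
tsset id year

* A new spell starts whenever empstat changes within a person
tsspell empstat

* By default this adds _spell (spell number), _seq (position within
* the spell) and _end (1 in the last observation of a spell)
list id year empstat _spell _seq _end, sepby(id)
```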
Once more working on some standardized data to generate some standard output and graphs in * yuck * Excel. Well, we have to learn to live with the fact that some people and organizations don’t want Stata graphs and tables. So we have to export to Excel. In my case these were several tables and cross-tabulations that were then linked to Excel graphs. So far I did this by copy-pasting or using Stat/Transfer; today I found a different way …