Useful string functions in Stata (updated list)

When I search the internet for help with Stata, it is most often because I need to work with string variables (such as names). There are some very good summaries that cover aspects of string variables (e.g., this page). In this post, which will be continuously updated, I present an assortment of string functions that I think are extremely useful for Stata users.

Continue reading “Useful string functions in Stata (updated list)”

Remove blanks from string variables in Stata

Identify and remove blank space in Stata’s string variables

When dealing with string variables in Stata, blank spaces can make it difficult to identify values. For example, if a variable contains " Arizona", a command with an if qualifier such as ... if state=="Arizona" won’t detect this observation.

Continue reading “Remove blanks from string variables in Stata”
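
For the blank-space problem described above, here is a minimal sketch of one way to spot and remove leading and trailing blanks. The toy data and the variable name has_blank are made up for illustration, and strtrim() assumes Stata 14 or newer (older versions offer trim() for the same job).

```stata
* Toy data: the same state name with and without stray blanks
clear
input str20 state
" Arizona"
"Arizona "
"Arizona"
end

* How many observations match before cleaning?
count if state == "Arizona"

* Flag observations that carry leading or trailing blanks
generate byte has_blank = state != strtrim(state)
list state has_blank

* Strip leading and trailing blanks, then count again
replace state = strtrim(state)
count if state == "Arizona"
```

Blanks inside a value are a separate case: stritrim() collapses runs of internal blanks into one, and subinstr(state, " ", "", .) removes every blank.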

Combining strings into one variable

I’m again playing around with strings in Stata and need to combine string variables with other string variables, as well as strings stored in locals with string variables. If you have two string variables (strvar1 and strvar2) and want to combine them, you can simply type

Continue reading “Combining strings into one variable”
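
As a sketch of the idea above (not necessarily the exact command used in the full post), string variables can be joined with the + operator. The toy data, the new variable names combined and labelled, and the local prefix are assumptions for illustration only.

```stata
* Toy data with two string variables
clear
input str10 strvar1 str10 strvar2
"New" "York"
"Los" "Angeles"
end

* Concatenate the two variables with a blank in between
generate combined = strvar1 + " " + strvar2

* Combine text held in a local macro (hypothetical name: prefix)
* with a string variable
local prefix "City:"
generate labelled = "`prefix'" + " " + combined

list
```

The local has to be wrapped in double quotes when it is used, so that Stata reads its contents as a string rather than as a variable name.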

Using question marks and stars in strings

I just came across the following problem: suppose you have a string variable (stringvar) that contains text and question marks. Question marks are usually used as wildcards for single letters or numbers. So, when you want to apply some changes to the variable (here: removing the “?”), you should NOT type

Continue reading “Using question marks and stars in strings”
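
One approach that sidesteps the wildcard issue, sketched here under the assumption that every question mark should go, is to use functions that read "?" literally. strpos() and subinstr() do so, whereas strmatch() interprets "?" and "*" as wildcards. The toy data and the variable name has_qmark are made up for illustration.

```stata
* Toy data: text with and without question marks
clear
input str20 stringvar
"what?"
"a?b?c"
"plain"
end

* strpos() searches for the literal character, so this flags
* observations that contain at least one "?"
generate byte has_qmark = strpos(stringvar, "?") > 0

* subinstr() also treats "?" literally; the final "." means
* "replace every occurrence"
replace stringvar = subinstr(stringvar, "?", "", .)

list
```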

Cleaning up messy (string) variables

Working on firm-level data (again), I am once more cleaning up hundreds of different spellings of occupations that should eventually be grouped into a set of categories that differ only where the occupations are genuinely different. Let me call the variable occupation.

34. slesar po remontu la
38. slesar po rem. la
44. slesar po rem. i obsluzh. vent. i kondicionirovaniya
54. slesar po rem. i obsluzh. ven. i kondicionirovaniya
146. slesar po rem. la
205. slesar po remontu agregatov
259. slesar po rem.agregatov
313. slesar po remontu kompressornyh ustanovok i oborudovaniya
343. slesar po remontu oborud

A wonderful mess, and actually only a small part of the full data set.
Continue reading “Cleaning up messy (string) variables”
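
A first pass at harmonizing spellings like those listed above might look like the following sketch. It assumes that "rem." always abbreviates "remontu", which the listing suggests but does not prove, and it uses only four of the listed values as toy data. stritrim() and strtrim() assume Stata 14 or newer.

```stata
* Toy data taken from the listing above
clear
input str60 occupation
"slesar po remontu la"
"slesar po rem. la"
"slesar po remontu agregatov"
"slesar po rem.agregatov"
end

* Expand the abbreviation (assumption: "rem." always means "remontu");
* the added blank covers cases such as "rem.agregatov"
replace occupation = subinstr(occupation, "rem.", "remontu ", .)

* Collapse doubled internal blanks and strip blanks at either end
replace occupation = stritrim(strtrim(occupation))

* Check how many distinct spellings remain
tabulate occupation
```

Running tabulate after each replacement makes it easy to see which variants are left to handle.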