SOEPlong: How to restructure complex longitudinal survey data (an application for the German Socio-Economic Panel study)
The social and behavioral sciences currently show an increasing demand for complex longitudinal household survey data for national and cross-national analyses. The state of the art (for national as well as international comparative data collections) provides two types of solutions: either the full presentation of all original wave-specific variables over time or the creation of fixed variables according to common time-consistent standards. The first type of solution leaves it to the researcher to decide how to reconcile categories that differ over time, and is therefore rather time-consuming. The second type of solution is very easy to use; however, it does not tell the user about extensions or modifications that may be necessary for specific years. In both cases, the researcher has no further information on potential changes in variables over time. This paper addresses how complex representative longitudinal data can be disseminated for analyses in the social and behavioral sciences such that the time needed for data preparation is reduced to a minimum while information on the consistency of, and changes in, variables over time remains fully available. It turns out that when changes in living conditions are monitored through permanent, regular observations in panel surveys, adaptations in variables are the rule rather than the exception. Our solution for restructuring longitudinal data therefore accommodates permanently ongoing adaptations in variables, which reflect measures adapted to new social conditions, new theoretical backgrounds, or improved conceptual measures. Using Stata, we provide a conceptual and technical solution for restructuring the full set of SOEP variables with complete documentation of all adaptations over time.
Our Stata programs generate two output files: one containing the restructured data and another containing the full documentation of the consistency of the variables across all waves. SOEPlong was first released in 2010 as a beta version, together with the usual data dissemination on DVD of the full set of SOEP variables for 26 waves of data. While the paper specifically addresses the German Socio-Economic Panel (SOEP) study, our general approach to dealing with complex household panel data might well be applied to other national and cross-national longitudinal household surveys.
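
The two-output idea described above (restructured data plus documentation of consistency) can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the authors' Stata implementation; the variable names (`emp08`, `emp09`, `emp10`), the category labels, the person identifiers, and the `restructure` function are all hypothetical and do not correspond to actual SOEP variables.

```python
# Hypothetical wave-specific data: the same concept is stored under a
# different variable name in each wave, and the category scheme changes
# in the last wave. All names, codes, and values are invented.
WAVES = {
    2008: {"var": "emp08",
           "labels": {1: "employed", 2: "not employed"},
           "rows": [(101, 1), (102, 2)]},
    2009: {"var": "emp09",
           "labels": {1: "employed", 2: "not employed"},
           "rows": [(101, 1), (102, 1)]},
    2010: {"var": "emp10",
           "labels": {1: "full-time", 2: "part-time", 3: "not employed"},
           "rows": [(101, 2), (102, 3)]},
}


def restructure(waves, long_name):
    """Return (long_rows, documentation) for one concept across waves.

    long_rows     : one record per person and year (long format).
    documentation : one record per year, naming the source variable and
                    flagging whether its categories differ from those of
                    the previous wave.
    """
    long_rows = []
    documentation = []
    previous_labels = None
    for year in sorted(waves):
        wave = waves[year]
        # Stack the wave-specific values under one common variable name.
        for person_id, value in wave["rows"]:
            long_rows.append({"pid": person_id, "year": year, long_name: value})
        # Record whether the category scheme changed relative to the prior wave.
        labels = wave["labels"]
        changed = previous_labels is not None and labels != previous_labels
        documentation.append({
            "variable": long_name,
            "year": year,
            "source_variable": wave["var"],
            "labels": labels,
            "changed_from_previous_wave": changed,
        })
        previous_labels = labels
    return long_rows, documentation


long_rows, documentation = restructure(WAVES, "emp")

# List the waves in which the category scheme changed.
for entry in documentation:
    if entry["changed_from_previous_wave"]:
        print(entry["year"], entry["source_variable"], entry["labels"])
```

In this sketch the original codes are kept unchanged in the long file, and the change in categories is only flagged in the documentation, so the decision on how to reconcile categories stays with the researcher.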
Handle: RePEc:boc:dsug11:09