
Model‐based clustering and analysis of life history data

Authors

  • Marc A. Scott
  • Kaushik Mohan
  • Jacques‐Antoine Gauthier

Abstract

Methods and models for longitudinal data with categorical, multi‐dimensional outcomes are quite limited, but they are essential to the study of life histories. For example, in the Swiss Household Panel, information on the co‐residence and professional status of several thousand individuals is available through to age 45 years. Interest centres on the time and order of life course events such as having children and working full or part time and the duration of the phases that they delineate. With data of this type, optimal matching and clustering algorithms relying on a distance metric or parametric models of duration in a competing risks framework are used; the appropriateness of each derives from competing goals and orientation. We prefer model‐based approaches when certain goals are paramount: simulation of individual trajectories; adjusting for time‐dependent covariates; handling multistate trajectories and missing outcomes. Several of these goals are particularly challenging when the number of states is of moderate size, and many transitions are infrequent and/or time inhomogeneous. Using the Swiss Household Panel, we demonstrate the appropriateness of latent class growth curve models for analysing sequence data. In particular, models including heterogeneous dependence structure provide new techniques for assessing goodness of fit as well as yield insights into social processes.
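The distance-based alternative mentioned in the abstract, optimal matching, can be illustrated with a minimal sketch. This is not the authors' method (the paper argues for model-based approaches); it only shows how an edit-style distance between two categorical state sequences is commonly computed. The function name `om_distance`, the constant costs (insertion/deletion = 1, substitution = 2), and the state labels "P" (part time) and "F" (full time) are assumptions chosen for illustration.

```python
from typing import Hashable, Sequence


def om_distance(
    a: Sequence[Hashable],
    b: Sequence[Hashable],
    indel: float = 1.0,  # assumed cost of inserting or deleting one state
    sub: float = 2.0,    # assumed constant cost of substituting one state
) -> float:
    """Optimal matching (edit) distance between two state sequences.

    Standard dynamic programme: the cell for prefixes a[:i] and b[:j]
    holds the cheapest series of insertions, deletions and substitutions
    that turns a[:i] into b[:j].
    """
    n, m = len(a), len(b)
    # Row 0: turning an empty prefix into b[:j] takes j insertions.
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        # Column 0: turning a[:i] into an empty prefix takes i deletions.
        curr = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            match_cost = 0.0 if a[i - 1] == b[j - 1] else sub
            curr[j] = min(
                prev[j] + indel,           # delete a[i-1]
                curr[j - 1] + indel,       # insert b[j-1]
                prev[j - 1] + match_cost,  # match or substitute
            )
        prev = curr
    return prev[m]


# Hypothetical yearly work states: "P" = part time, "F" = full time.
early_switch = ["P", "F", "F", "F"]
late_switch = ["P", "P", "F", "F"]
print(om_distance(early_switch, late_switch))
```

A matrix of such pairwise distances is what a clustering algorithm would then take as input; in practice the substitution costs are often state-specific rather than constant.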

Suggested Citation

  • Marc A. Scott & Kaushik Mohan & Jacques‐Antoine Gauthier, 2020. "Model‐based clustering and analysis of life history data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(3), pages 1231-1251, June.
  • Handle: RePEc:bla:jorssa:v:183:y:2020:i:3:p:1231-1251
    DOI: 10.1111/rssa.12575

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12575
    Download Restriction: no


    References listed on IDEAS

    1. Nicola Barban & Francesco C. Billari, 2012. "Classifying life course trajectories: a comparison of latent class and sequence analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 61(5), pages 765-784, November.
    2. Matthias Studer & Gilbert Ritschard, 2016. "What matters in differences between life trajectories: a comparative review of sequence dissimilarity measures," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(2), pages 481-511, February.
    3. Gabadinho, Alexis & Ritschard, Gilbert & Müller, Nicolas S & Studer, Matthias, 2011. "Analyzing and Visualizing State Sequences in R with TraMineR," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i04).
    4. Leonard J. Paas & Jeroen K. Vermunt & Tammo H. A. Bijmolt, 2007. "Discrete time, discrete state latent Markov modelling for assessing and predicting household acquisitions of financial products," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(4), pages 955-974, October.
    5. Grün, Bettina & Leisch, Friedrich, 2008. "FlexMix Version 2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 28(i04).
    6. Paas, L.J. & Vermunt, J.K. & Bijmolt, T.H.A., 2007. "Discrete-time discrete-state latent Markov modelling for assessing and predicting household acquisitions of financial products," Other publications TiSEM 5781ab33-6687-4ad5-b57a-3, Tilburg University, School of Economics and Management.
    7. Marc A. Scott, 2011. "Affinity models for career sequences," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 60(3), pages 417-436, May.
    8. Leisch, Friedrich, 2004. "FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 11(i08).
    9. Dehnert, M. & Helm, W.E. & Hütt, M.-Th., 2003. "A discrete autoregressive process as a model for short-range correlations in DNA sequences," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 327(3), pages 535-553.

    Citations



    Cited by:

    1. Marc A. Scott & Jean-Marie Goff & Jacques-Antoine Gauthier, 2024. "History matters: the statistical modelling of the life course," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(1), pages 445-469, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marc A. Scott & Jean-Marie Goff & Jacques-Antoine Gauthier, 2024. "History matters: the statistical modelling of the life course," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(1), pages 445-469, February.
    2. Elsenburg, Leonie K. & Rieckmann, Andreas & Bengtsson, Jessica & Jensen, Andreas Kryger & Rod, Naja Hulvej, 2024. "Application of life course trajectory methods to public health data: A comparison of sequence analysis and group-based multi-trajectory modeling for modelling childhood adversity trajectories," Social Science & Medicine, Elsevier, vol. 340(C).
    3. Liao, Tim F. & Bolano, Danilo & Brzinsky-Fay, Christian & Cornwell, Benjamin & Fasang, Anette Eva & Helske, Satu & Piccarreta, Raffaella & Raab, Marcel & Ritschard, Gilbert & Struffolino, Emanuela & S, 2022. "Sequence analysis: Its past, present, and future," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 107, pages 1-1.
    4. Marcel Raab & Emanuela Struffolino, 2020. "The Heterogeneity of Partnership Trajectories to Childlessness in Germany," European Journal of Population, Springer;European Association for Population Studies, vol. 36(1), pages 53-70, March.
    5. Julia Mikolai & Hill Kulu, 2019. "Union dissolution and housing trajectories in Britain," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 41(7), pages 161-196.
    6. Christian Kleiber & Achim Zeileis, 2016. "Visualizing Count Data Regressions Using Rootograms," The American Statistician, Taylor & Francis Journals, vol. 70(3), pages 296-303, July.
    7. Lebret, Rémi & Iovleff, Serge & Langrognet, Florent & Biernacki, Christophe & Celeux, Gilles & Govaert, Gérard, 2015. "Rmixmod: The R Package of the Model-Based Unsupervised, Supervised, and Semi-Supervised Classification Mixmod Library," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 67(i06).
    8. Grün, Bettina & Kosmidis, Ioannis & Zeileis, Achim, 2012. "Extended Beta Regression in R: Shaken, Stirred, Mixed, and Partitioned," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 48(i11).
    9. Wang, Po-Chieh & Hsu, Yu-Ting & Hsu, Chia-Wei, 2021. "Analysis of waiting time perception of bus passengers provided with mobile service," Transportation Research Part A: Policy and Practice, Elsevier, vol. 145(C), pages 319-336.
    10. Roberto Mari & Salvatore Ingrassia & Antonio Punzo, 2023. "Local and Overall Deviance R-Squared Measures for Mixtures of Generalized Linear Models," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 233-266, July.
    11. Babette Bühler & Katja Möhring & Andreas P. Weiland, 2022. "Assessing dissimilarity of employment history information from survey and administrative data using sequence analysis techniques," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(6), pages 4747-4774, December.
    12. Estelle McLean & Amelia C Crampin & Rebecca Sear & Maria Sironi & Emma Slaymaker & Albert Dube, 2024. "Transitions to adulthood in men and women in rural Malawi in the 21st century using sequence analysis: Some evidence of delay," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 51(14), pages 459-500.
    13. Olga Czeranowska & Dominika Winogrodzka, 2024. "Socio-occupational Paths of Polish and Lithuanian Returning Migrants: Sequence Analysis of Survey Data with the Use of TraMineR for R," Journal of International Migration and Integration, Springer, vol. 25(2), pages 997-1025, June.
    14. Montorsi, Carlotta & Fusco, Alessio & Van Kerm, Philippe & Bordas, Stéphane P.A., 2024. "Predicting depression in old age: Combining life course data with machine learning," Economics & Human Biology, Elsevier, vol. 52(C).
    15. Nicolas Städler & Peter Bühlmann & Sara Geer, 2010. "ℓ 1 -penalization for mixture regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 19(2), pages 209-256, August.
    16. repec:osf:socarx:3mcfp_v1 is not listed on IDEAS
    17. Devillanova, Carlo & Raitano, Michele & Struffolino, Emanuela, 2019. "Longitudinal employment trajectories and health in middle life: Insights from linked administrative and survey data," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 40, pages 1375-1412.
    18. David Plavcan & Georg J. Mayr & Achim Zeileis, 2013. "Automatic and Probabilistic Foehn Diagnosis with a Statistical Mixture Model," Working Papers 2013-22, Faculty of Economics and Statistics, Universität Innsbruck.
    19. Piccarreta, Raffaella & Bonetti, Marco, 2019. "Assessing and comparing models for sequence data by microsimulation (with Supplementary Material)," SocArXiv 3mcfp, Center for Open Science.
    20. Ian Wadsworth & Lisa V. Hampson & Thomas Jaki & Graeme J. Sills & Anthony G. Marson & Richard Appleton, 2020. "A quantitative framework to inform extrapolation decisions in children," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(2), pages 515-534, February.
    21. Utkarsh J. Dang & Antonio Punzo & Paul D. McNicholas & Salvatore Ingrassia & Ryan P. Browne, 2017. "Multivariate Response and Parsimony for Gaussian Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 34(1), pages 4-34, April.
