Identifying and Qualifying Deviant Cases in Clusters of Sequences: The Why and The How

Authors

  • Raffaella Piccarreta (Bocconi University)
  • Emanuela Struffolino (University of Milan)

Abstract

Sequence analysis is employed in different fields—e.g., demography, sociology, and political sciences—to describe longitudinal processes represented as sequences of categorical states. In many applications, sequences are clustered to identify relevant types, which reflect the different empirical realisations of the temporal process under study. We explore criteria to inspect internal cluster composition and to detect deviant sequences, that is, cases characterised by rare patterns or outliers that might compromise cluster homogeneity. We also introduce tools to visualise and distinguish the features of regular and deviant cases. Our proposals offer a more accurate and granular description of the data structure, by identifying—besides the most typical types—peculiar sequences that might be interesting from a substantive and theoretical point of view. This analysis could be very useful in applications where—under the assumption of within homogeneity—clusters are used as outcome or explanatory variables in regressions. We demonstrate the added value of our proposal in a motivating application from life-course socio-demography, focusing on Italian women’s employment trajectories and on their link with their mothers’ participation in the labour market across geographical areas.

Suggested Citation

  • Raffaella Piccarreta & Emanuela Struffolino, 2024. "Identifying and Qualifying Deviant Cases in Clusters of Sequences: The Why and The How," European Journal of Population, Springer;European Association for Population Studies, vol. 40(1), pages 1-19, December.
  • Handle: RePEc:spr:eurpop:v:40:y:2024:i:1:d:10.1007_s10680-023-09682-3
    DOI: 10.1007/s10680-023-09682-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10680-023-09682-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
