Printed from https://ideas.repec.org/a/eee/csdana/v54y2010i6p1516-1524.html

Binary trees for dissimilarity data

Author

Listed:
  • Piccarreta, Raffaella

Abstract

Binary segmentation procedures (in particular, classification and regression trees) are extended to study the relation between dissimilarity data and a set of explanatory variables. The proposed split criterion is very flexible, and can be applied to a wide range of data (e.g., mixed types of multiple responses, longitudinal data, sequence data). Also, it can be shown to be an extension of well-established criteria introduced in the literature on binary trees.
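
The abstract does not state the split criterion itself. As a rough illustration of how a binary split can be chosen from a dissimilarity matrix alone, the sketch below assumes a node impurity equal to the sum of pairwise dissimilarities within the node divided by the node size, and picks the threshold on a single numeric explanatory variable that most reduces it. The function names (`node_impurity`, `best_split`) and the toy data are hypothetical, and the paper's actual criterion may differ. With squared Euclidean dissimilarities this impurity equals the within-node sum of squares used in regression trees, which is one way such a criterion can extend the classical ones.

```python
import numpy as np

def node_impurity(D, idx):
    """Sum of pairwise dissimilarities among the cases in idx, divided by node size."""
    idx = np.asarray(idx)
    n = len(idx)
    if n == 0:
        return 0.0
    sub = D[np.ix_(idx, idx)]
    # The symmetric matrix counts each unordered pair twice, hence the factor 2.
    return sub.sum() / (2.0 * n)

def best_split(D, x):
    """Return (threshold, gain) for the split x <= threshold with the largest impurity reduction."""
    x = np.asarray(x, dtype=float)
    all_idx = np.arange(len(x))
    parent = node_impurity(D, all_idx)
    best_thr, best_gain = None, -np.inf
    values = np.unique(x)
    # Candidate thresholds: midpoints between consecutive distinct values of x.
    for lo, hi in zip(values[:-1], values[1:]):
        thr = (lo + hi) / 2.0
        left = all_idx[x <= thr]
        right = all_idx[x > thr]
        gain = parent - node_impurity(D, left) - node_impurity(D, right)
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain

# Toy data (hypothetical): a univariate response with two well-separated groups.
y = np.array([1.0, 1.2, 0.8, 5.0, 5.3, 4.9])
x = np.array([10.0, 11.0, 12.0, 20.0, 21.0, 22.0])
D = (y[:, None] - y[None, :]) ** 2  # squared Euclidean dissimilarities

threshold, gain = best_split(D, x)
```

On this toy data the selected threshold falls between the two groups of `x` values. Because only the matrix `D` enters the criterion, the same functions would accept dissimilarities computed between sequences or mixed-type responses; applying `best_split` recursively to each child node would grow a full tree.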

Suggested Citation

  • Piccarreta, Raffaella, 2010. "Binary trees for dissimilarity data," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1516-1524, June.
  • Handle: RePEc:eee:csdana:v:54:y:2010:i:6:p:1516-1524

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(09)00463-0
    Download Restriction: Full text for ScienceDirect subscribers only.

    Citations

    Cited by:

    1. Antonella Plaia & Simona Buscemi & Johannes Fürnkranz & Eneldo Loza Mencía, 2022. "Comparing Boosting and Bagging for Decision Trees of Rankings," Journal of Classification, Springer;The Classification Society, vol. 39(1), pages 78-99, March.
    2. Antonella Plaia & Mariangela Sciandra, 2019. "Weighted distance-based trees for ranking data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 427-444, June.
    3. Matthias Studer & Gilbert Ritschard & Alexis Gabadinho & Nicolas S. Müller, 2011. "Discrepancy Analysis of State Sequences," Sociological Methods & Research, vol. 40(3), pages 471-510, August.
    4. Marco Bonetti & Raffaella Piccarreta & Gaia Salford, 2013. "Parametric and Nonparametric Analysis of Life Courses: An Application to Family Formation Patterns," Demography, Springer;Population Association of America (PAA), vol. 50(3), pages 881-902, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marco Bonetti & Raffaella Piccarreta & Gaia Salford, 2013. "Parametric and Nonparametric Analysis of Life Courses: An Application to Family Formation Patterns," Demography, Springer;Population Association of America (PAA), vol. 50(3), pages 881-902, June.
    2. Schmid, Lena & Gerharz, Alexander & Groll, Andreas & Pauly, Markus, 2023. "Tree-based ensembles for multi-output regression: Comparing multivariate approaches with separate univariate ones," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    3. Cees H. Elzinga, 2010. "Complexity of Categorical Time Series," Sociological Methods & Research, vol. 38(3), pages 463-481, February.
    4. Brendan Halpin, 2010. "Optimal Matching Analysis and Life-Course Data: The Importance of Duration," Sociological Methods & Research, vol. 38(3), pages 365-388, February.
    5. Michael Anyadike-Danes & Duncan McVicar, 2010. "My Brilliant Career: Characterizing the Early Labor Market Trajectories of British Women From Generation X," Sociological Methods & Research, vol. 38(3), pages 482-512, February.
    6. Serah Shin & Hyungsoo Kim, 2018. "Health Trajectories of Older Americans and Medical Expenses: Evidence from the Health and Retirement Study Data Over the 18 Year Period," Journal of Family and Economic Issues, Springer, vol. 39(1), pages 19-33, March.
    7. Raffaella Piccarreta, 2012. "Graphical and Smoothing Techniques for Sequence Analysis," Sociological Methods & Research, vol. 41(2), pages 362-380, May.
    8. Antonio D’Ambrosio & Willem J. Heiser, 2016. "A Recursive Partitioning Method for the Prediction of Preference Rankings Based Upon Kemeny Distances," Psychometrika, Springer;The Psychometric Society, vol. 81(3), pages 774-794, September.
    9. Piccarreta, Raffaella & Bonetti, Marco, 2019. "Assessing and comparing models for sequence data by microsimulation (with Supplementary Material)," SocArXiv 3mcfp, Center for Open Science.
    10. Raffaella Piccarreta & Orna Lior, 2010. "Exploring sequences: a graphical tool based on multi‐dimensional scaling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 173(1), pages 165-184, January.
    11. Nicola Barban, 2013. "Family Trajectories and Health: A Life Course Perspective [Trajectoires familiales et santé: une approche sous l’angle de parcours de vie]," European Journal of Population, Springer;European Association for Population Studies, vol. 29(4), pages 357-385, November.
    12. Raffaella Piccarreta & Marco Bonetti & Stefano Lombardi, 2018. "Comparing models for sequence data: prediction and dissimilarities," Working Papers 113, "Carlo F. Dondena" Centre for Research on Social Dynamics (DONDENA), Università Commerciale Luigi Bocconi.
    13. Christian Brzinsky-Fay & Ulrich Kohler, 2010. "New Developments in Sequence Analysis," Sociological Methods & Research, vol. 38(3), pages 359-364, February.
    14. Nicholas Longford & Ioana C. Salagean, 2013. "A study of the labour market trajectories in the Grand Duchy of Luxembourg," Economics Working Papers 1396, Department of Economics and Business, Universitat Pompeu Fabra.
    15. Hsiao, Wei-Cheng & Shih, Yu-Shan, 2007. "Splitting variable selection for multivariate regression trees," Statistics & Probability Letters, Elsevier, vol. 77(3), pages 265-271, February.
    16. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    17. Júlia Mikolai & Hill Kulu, 2019. "Union dissolution and housing trajectories in Britain," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 41(7), pages 161-196.
    18. repec:jss:jstsof:40:i04 is not listed on IDEAS
    19. Moehring, Katja & Weiland, Andreas & Reifenscheid, Maximiliane & Naumann, Elias & Wenz, Alexander & Rettig, Tobias & Krieger, Ulrich & Fikel, Marina & Cornesse, Carina & Blom, Annelies G., 2021. "Inequality in employment trajectories and their socio-economic consequences during the early phase of the COVID-19 pandemic in Germany," SocArXiv m95df, Center for Open Science.
    20. Mark G E White & Neil E Bezodis & Jonathon Neville & Huw Summers & Paul Rees, 2022. "Determining jumping performance from a single body-worn accelerometer using machine learning," PLOS ONE, Public Library of Science, vol. 17(2), pages 1-25, February.
    21. Biewen, Martin & Kugler, Philipp, 2021. "Two-stage least squares random forests with an application to Angrist and Evans (1998)," Economics Letters, Elsevier, vol. 204(C).
