Printed from https://ideas.repec.org/a/eee/csdana/v54y2010i6p1516-1524.html

Binary trees for dissimilarity data

Author

Listed:
  • Piccarreta, Raffaella

Abstract

Binary segmentation procedures (in particular, classification and regression trees) are extended to study the relation between dissimilarity data and a set of explanatory variables. The proposed split criterion is very flexible, and can be applied to a wide range of data (e.g., mixed types of multiple responses, longitudinal data, sequence data). Also, it can be shown to be an extension of well-established criteria introduced in the literature on binary trees.
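
The abstract does not state the split criterion itself, so the following is only a minimal sketch of how a tree split can be scored from a dissimilarity matrix alone. It assumes node impurity is the sum of squared pairwise dissimilarities within a node divided by the node size; for Euclidean distances this equals the within-node sum of squares used by regression trees, which is one way such a criterion can extend an established one. The names `impurity` and `best_split`, the single numeric explanatory variable, and the toy data are all illustrative assumptions, not the paper's implementation.

```python
from itertools import combinations


def impurity(D, idx):
    """Assumed impurity: sum of squared pairwise dissimilarities in a node,
    divided by the node size (equals within-node SS for Euclidean distances)."""
    n = len(idx)
    if n == 0:
        return 0.0
    return sum(D[i][j] ** 2 for i, j in combinations(idx, 2)) / n


def best_split(D, x):
    """Search midpoints of one numeric explanatory variable x and return
    (threshold, impurity_reduction) for the best binary split."""
    idx = list(range(len(x)))
    parent = impurity(D, idx)
    best = (None, 0.0)
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [i for i in idx if x[i] <= t]
        right = [i for i in idx if x[i] > t]
        gain = parent - impurity(D, left) - impurity(D, right)
        if gain > best[1]:
            best = (t, gain)
    return best


# Toy example (hypothetical data): dissimilarities derived from a 1-D response,
# although only the matrix D is used by the split search.
y = [0.0, 0.1, 0.2, 5.0, 5.1]
D = [[abs(a - b) for b in y] for a in y]
x = [1, 2, 3, 10, 11]
threshold, gain = best_split(D, x)
print(threshold, gain)
```

Because the search only reads `D`, the same sketch would apply to any dissimilarity (for example, one computed between sequences or mixed-type responses), which is the flexibility the abstract points to.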

Suggested Citation

  • Piccarreta, Raffaella, 2010. "Binary trees for dissimilarity data," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1516-1524, June.
  • Handle: RePEc:eee:csdana:v:54:y:2010:i:6:p:1516-1524

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(09)00463-0
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. David R. Larsen & Paul L. Speckman, 2004. "Multivariate Regression Trees for Analysis of Abundance Data," Biometrics, The International Biometric Society, vol. 60(2), pages 543-549, June.
    2. Sexton, Joseph & Laake, Petter, 2009. "Standard errors for bagged and random forest estimators," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 801-811, January.
    3. Briand, Bénédicte & Ducharme, Gilles R. & Parache, Vanessa & Mercat-Rommens, Catherine, 2009. "A similarity measure to assess the stability of classification trees," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1208-1217, February.
    4. Cees H. Elzinga, 2005. "Combinatorial Representations of Token Sequences," Journal of Classification, Springer;The Classification Society, vol. 22(1), pages 87-118, June.
    5. Dine, Abdessamad & Larocque, Denis & Bellavance, François, 2009. "Multivariate trees for mixed outcomes," Computational Statistics & Data Analysis, Elsevier, vol. 53(11), pages 3795-3804, September.
    6. Siciliano, Roberta & Mola, Francesco, 2000. "Multivariate data analysis and modeling through classification and regression trees," Computational Statistics & Data Analysis, Elsevier, vol. 32(3-4), pages 285-301, January.
    7. Pierpaolo D’Urso, 2000. "Dissimilarity measures for time trajectories," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 9(1), pages 53-83, January.
    8. Raffaella Piccarreta & Francesco C. Billari, 2007. "Clustering work and family trajectories by using a divisive algorithm," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(4), pages 1061-1078, October.
    9. Henk Kiers & Donatella Vicari & Maurizio Vichi, 2005. "Simultaneous classification and multidimensional scaling with external information," Psychometrika, Springer;The Psychometric Society, vol. 70(3), pages 433-460, September.
    10. Kim, Ji-Hyun, 2009. "Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap," Computational Statistics & Data Analysis, Elsevier, vol. 53(11), pages 3735-3745, September.

    Citations



    Cited by:

    1. Marco Bonetti & Raffaella Piccarreta & Gaia Salford, 2013. "Parametric and Nonparametric Analysis of Life Courses: An Application to Family Formation Patterns," Demography, Springer;Population Association of America (PAA), vol. 50(3), pages 881-902, June.
    2. Antonella Plaia & Mariangela Sciandra, 2019. "Weighted distance-based trees for ranking data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 427-444, June.
    3. Matthias Studer & Gilbert Ritschard & Alexis Gabadinho & Nicolas S. Müller, 2011. "Discrepancy Analysis of State Sequences," Sociological Methods & Research, vol. 40(3), pages 471-510, August.
    4. Antonella Plaia & Simona Buscemi & Johannes Fürnkranz & Eneldo Loza Mencía, 2022. "Comparing Boosting and Bagging for Decision Trees of Rankings," Journal of Classification, Springer;The Classification Society, vol. 39(1), pages 78-99, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Schmid, Lena & Gerharz, Alexander & Groll, Andreas & Pauly, Markus, 2023. "Tree-based ensembles for multi-output regression: Comparing multivariate approaches with separate univariate ones," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    2. Marco Bonetti & Raffaella Piccarreta & Gaia Salford, 2013. "Parametric and Nonparametric Analysis of Life Courses: An Application to Family Formation Patterns," Demography, Springer;Population Association of America (PAA), vol. 50(3), pages 881-902, June.
    3. Antonio D’Ambrosio & Willem J. Heiser, 2016. "A Recursive Partitioning Method for the Prediction of Preference Rankings Based Upon Kemeny Distances," Psychometrika, Springer;The Psychometric Society, vol. 81(3), pages 774-794, September.
    4. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    5. Brendan Halpin, 2010. "Optimal Matching Analysis and Life-Course Data: The Importance of Duration," Sociological Methods & Research, vol. 38(3), pages 365-388, February.
    6. Biewen, Martin & Kugler, Philipp, 2021. "Two-stage least squares random forests with an application to Angrist and Evans (1998)," Economics Letters, Elsevier, vol. 204(C).
    7. DeSarbo, Wayne S. & Selin Atalay, A. & Blanchard, Simon J., 2009. "A three-way clusterwise multidimensional unfolding procedure for the spatial representation of context dependent preferences," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3217-3230, June.
    8. Arnstein Aassve & Francesco C. Billari & Raffaella Piccarreta, 2007. "Strings of Adulthood: A Sequence Analysis of Young British Women’s Work-Family Trajectories," European Journal of Population, Springer;European Association for Population Studies, vol. 23(3), pages 369-388, October.
    9. Airola, Antti & Pahikkala, Tapio & Waegeman, Willem & De Baets, Bernard & Salakoski, Tapio, 2011. "An experimental comparison of cross-validation techniques for estimating the area under the ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1828-1844, April.
    10. Michael Anyadike-Danes & Duncan McVicar, 2010. "My Brilliant Career: Characterizing the Early Labor Market Trajectories of British Women From Generation X," Sociological Methods & Research, vol. 38(3), pages 482-512, February.
    11. David M. Ritzwoller & Vasilis Syrgkanis, 2024. "Simultaneous Inference for Local Structural Parameters with Random Forests," Papers 2405.07860, arXiv.org, revised Sep 2024.
    12. Coppi, Renato & D'Urso, Pierpaolo, 2006. "Fuzzy unsupervised classification of multivariate time trajectories with the Shannon entropy regularization," Computational Statistics & Data Analysis, Elsevier, vol. 50(6), pages 1452-1477, March.
    13. John J Nay & Yevgeniy Vorobeychik, 2016. "Predicting Human Cooperation," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-19, May.
    14. Kazim Topuz & Behrooz Davazdahemami & Dursun Delen, 2024. "A Bayesian belief network-based analytics methodology for early-stage risk detection of novel diseases," Annals of Operations Research, Springer, vol. 341(1), pages 673-697, October.
    15. Mariangela Sciandra & Antonella Plaia & Vincenza Capursi, 2017. "Classification trees for multivariate ordinal response: an application to Student Evaluation Teaching," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(2), pages 641-655, March.
    16. Augustine Denteh & Helge Liebert, 2022. "Who Increases Emergency Department Use? New Insights from the Oregon Health Insurance Experiment," Working Papers 2201, Tulane University, Department of Economics.
    17. Karolis Matikonis & Matthew Gobey, 2024. "Small Business Property Tax Reductions and Firm Productivity," Small Business Economics, Springer, vol. 62(1), pages 307-324, January.
    18. Matthew Tuson & Berwin Turlach & Kevin Murray & Mei Ruu Kok & Alistair Vickery & David Whyatt, 2021. "Predicting Future Geographic Hotspots of Potentially Preventable Hospitalisations Using All Subset Model Selection and Repeated K-Fold Cross-Validation," IJERPH, MDPI, vol. 18(19), pages 1-21, September.
    19. Keon Lee, Seong, 2005. "On generalized multivariate decision tree by using GEE," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 1105-1119, June.
    20. Susan Athey & Julie Tibshirani & Stefan Wager, 2016. "Generalized Random Forests," Papers 1610.01271, arXiv.org, revised Apr 2018.
