How Much Does It Cost?

Author

Listed:
  • Jacques-Antoine Gauthier

    (University of Lausanne, Switzerland, Jacques-Antoine.Gauthier@unil.ch)

  • Eric D. Widmer

    (University of Geneva, Switzerland)

  • Philipp Bucher

    (Swiss Institute of Bioinformatics and Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland)

  • Cédric Notredame

    (Centre National de la Recherche Scientifique, Marseille, France, and Centre for Genomic Regulation, Barcelona, Spain)

Abstract

One major methodological problem in analysis of sequence data is the determination of costs from which distances between sequences are derived. Although this problem is currently not optimally dealt with in the social sciences, it has some similarity with problems that have been solved in bioinformatics for three decades. In this article, the authors propose an optimization of substitution and deletion/insertion costs based on computational methods. The authors provide an empirical way of determining costs for cases, frequent in the social sciences, in which theory does not clearly promote one cost scheme over another. Using three distinct data sets, the authors tested the distances and cluster solutions produced by the new cost scheme in comparison with solutions based on cost schemes associated with other research strategies. The proposed method performs well compared with other cost-setting strategies, while it alleviates the justification problem of cost schemes.
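To make concrete how a cost scheme drives the distance between two sequences, here is a minimal sketch of a standard edit-distance (optimal matching) computation with user-supplied substitution and insertion/deletion costs. It is not the authors' optimization procedure, which the abstract does not detail. The function name `om_distance`, the state labels `E` and `U`, and all cost values are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Sequence


def om_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Dict[FrozenSet[str], float],
    indel: float = 1.0,
) -> float:
    """Minimum total cost of turning sequence a into sequence b.

    sub_cost maps an unordered pair of distinct states to its substitution
    cost; indel is the cost of one insertion or one deletion.
    (Illustrative helper, not taken from the article.)
    """
    n, m = len(a), len(b)
    # d[i][j] = cheapest way to transform a[:i] into b[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel          # delete the first i states of a
    for j in range(1, m + 1):
        d[0][j] = j * indel          # insert the first j states of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            x, y = a[i - 1], b[j - 1]
            s = 0.0 if x == y else sub_cost[frozenset((x, y))]
            d[i][j] = min(
                d[i - 1][j] + indel,     # deletion
                d[i][j - 1] + indel,     # insertion
                d[i - 1][j - 1] + s,     # match or substitution
            )
    return d[n][m]


if __name__ == "__main__":
    # Hypothetical yearly states: E = employed, U = unemployed.
    seq_a, seq_b = "EEUU", "EUUU"
    for sub, ind in [(2.0, 1.0), (1.0, 1.0), (2.0, 0.25)]:
        dist = om_distance(seq_a, seq_b, {frozenset("EU"): sub}, indel=ind)
        print(f"substitution={sub}, indel={ind}: distance={dist}")
```

The same pair of sequences gets a different distance under each of the three cost settings, which is why the choice of costs needs justification before distances are passed on to a clustering step.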

Suggested Citation

  • Jacques-Antoine Gauthier & Eric D. Widmer & Philipp Bucher & Cédric Notredame, 2009. "How Much Does It Cost?," Sociological Methods & Research, vol. 38(1), pages 197-231, August.
  • Handle: RePEc:sae:somere:v:38:y:2009:i:1:p:197-231
    DOI: 10.1177/0049124109342065

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/0049124109342065
    Download Restriction: no


    Citations

    Cited by:

    1. Hanly, Mark & Clarke, Paul & Steele, Fiona, 2016. "Sequence analysis of call record data: exploring the role of different cost settings," LSE Research Online Documents on Economics 64896, London School of Economics and Political Science, LSE Library.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Véronique Cariou & Stéphane Verdun & Emmanuelle Diaz & El Qannari & Evelyne Vigneau, 2009. "Comparison of three hypothesis testing approaches for the selection of the appropriate number of clusters of variables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 3(3), pages 227-241, December.
    2. Z. Volkovich & Z. Barzily & G.-W. Weber & D. Toledano-Kitai & R. Avros, 2012. "An application of the minimal spanning tree approach to the cluster stability problem," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 20(1), pages 119-139, March.
    3. Silke Aisenbrey & Anette E. Fasang, 2010. "New Life for Old Ideas: The "Second Wave" of Sequence Analysis Bringing the "Course" Back Into the Life Course," Sociological Methods & Research, vol. 38(3), pages 420-462, February.
    4. Z. Volkovich & D. Toledano-Kitai & G.-W. Weber, 2013. "Self-learning K-means clustering: a global optimization approach," Journal of Global Optimization, Springer, vol. 56(2), pages 219-232, June.
    5. Jeanette Engzell, 2023. "Beyond the stereotype of an intrapreneur: an exploratory study of different intrapreneurs and various corporate conditions," SN Business & Economics, Springer, vol. 3(8), pages 1-24, August.
    6. María Gallegos & Gunter Ritter, 2009. "Trimming algorithms for clustering contaminated grouped data and their robustness," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 3(2), pages 135-167, September.
    7. Pacáková, Z. & Poláčková, J., 2013. "Hierarchical Cluster Analysis – Various Approaches to Data Preparation," AGRIS on-line Papers in Economics and Informatics, Czech University of Life Sciences Prague, Faculty of Economics and Management, vol. 5(3), pages 1-11, September.
    8. Ranjan Maitra & Ivan P. Ramler, 2009. "Clustering in the Presence of Scatter," Biometrics, The International Biometric Society, vol. 65(2), pages 341-352, June.
    9. Gallegos, María Teresa & Ritter, Gunter, 2010. "Using combinatorial optimization in model-based trimmed clustering with cardinality constraints," Computational Statistics & Data Analysis, Elsevier, vol. 54(3), pages 637-654, March.
    10. Liu, Pei-chen Barry & Hansen, Mark & Mukherjee, Avijit, 2008. "Scenario-based air traffic flow management: From theory to practice," Transportation Research Part B: Methodological, Elsevier, vol. 42(7-8), pages 685-702, August.
    11. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
    12. Alessandra Cepparulo & Antonello Zanfei, 2019. "The diffusion of public eServices in European cities," Working Papers 1904, University of Urbino Carlo Bo, Department of Economics, Society & Politics - Scientific Committee - L. Stefanini & G. Travaglini, revised 2019.
    13. Noelia Caceres & Luis M. Romero & Francisco J. Morales & Antonio Reyes & Francisco G. Benitez, 2018. "Estimating traffic volumes on intercity road locations using roadway attributes, socioeconomic features and other work-related activity characteristics," Transportation, Springer, vol. 45(5), pages 1449-1473, September.
    14. Michele Cincera, 2005. "Firms' productivity growth and R&D spillovers: An analysis of alternative technological proximity measures," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 14(8), pages 657-682.
    15. Douglas L. Steinley & M. J. Brusco, 2019. "Using an Iterative Reallocation Partitioning Algorithm to Verify Test Multidimensionality," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 397-413, October.
    16. Devillanova, Carlo & Raitano, Michele & Struffolino, Emanuela, 2019. "Longitudinal employment trajectories and health in middle life: Insights from linked administrative and survey data," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, pages 1375-1412.
    17. Javier Sevil-Serrano & Alberto Aibar-Solana & Ángel Abós & José Antonio Julián & Luis García-González, 2019. "Healthy or Unhealthy? The Cocktail of Health-Related Behavior Profiles in Spanish Adolescents," IJERPH, MDPI, vol. 16(17), pages 1-14, August.
    18. Jack DeWaard & Keuntae Kim & James Raymer, 2012. "Migration Systems in Europe: Evidence From Harmonized Flow Data," Demography, Springer;Population Association of America (PAA), vol. 49(4), pages 1307-1333, November.
    19. Vicente Rodríguez Montequín & Joaquín Villanueva Balsera & Sonia María Cousillas Fernández & Francisco Ortega Fernández, 2018. "Exploring Project Complexity through Project Failure Factors: Analysis of Cluster Patterns Using Self-Organizing Maps," Complexity, Hindawi, vol. 2018, pages 1-17, May.
    20. Goethner, Maximilian & Hornuf, Lars & Regner, Tobias, 2021. "Protecting investors in equity crowdfunding: An empirical analysis of the small investor protection act," Technological Forecasting and Social Change, Elsevier, vol. 162(C).
