IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0110268.html

The Dawn of Open Access to Phylogenetic Data

Authors

  • Andrew F Magee
  • Michael R May
  • Brian R Moore

Abstract

The scientific enterprise depends critically on the preservation of and open access to published data. This basic tenet applies acutely to phylogenies (estimates of evolutionary relationships among species). Increasingly, phylogenies are estimated from increasingly large, genome-scale datasets using increasingly complex statistical methods that require increasing levels of expertise and computational investment. Moreover, the resulting phylogenetic data provide an explicit historical perspective that critically informs research in a vast and growing number of scientific disciplines. One such use is the study of changes in rates of lineage diversification (speciation – extinction) through time. As part of a meta-analysis in this area, we sought to collect phylogenetic data (comprising nucleotide sequence alignment and tree files) from 217 studies published in 46 journals over a 13-year period. We document our attempts to procure those data (from online archives and by direct request to corresponding authors), and report results of analyses (using Bayesian logistic regression) to assess the impact of various factors on the success of our efforts. Overall, complete phylogenetic data for of these studies are effectively lost to science. Our study indicates that phylogenetic data are more likely to be deposited in online archives and/or shared upon request when: (1) the publishing journal has a strong data-sharing policy; (2) the publishing journal has a higher impact factor; and (3) the data are requested from faculty rather than students. Importantly, our survey spans recent policy initiatives and infrastructural changes; our analyses indicate that the positive impact of these community initiatives has been both dramatic and immediate. Although the results of our study indicate that the situation is dire, our findings also reveal tremendous recent progress in the sharing and preservation of phylogenetic data.

Suggested Citation

  • Andrew F Magee & Michael R May & Brian R Moore, 2014. "The Dawn of Open Access to Phylogenetic Data," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-10, October.
  • Handle: RePEc:plo:pone00:0110268
    DOI: 10.1371/journal.pone.0110268

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0110268
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0110268&type=printable
    Download Restriction: no

    References listed on IDEAS

    1. Heather A Piwowar, 2011. "Who Shares? Who Doesn't? Factors Associated with Openly Archiving Raw Research Data," PLOS ONE, Public Library of Science, vol. 6(7), pages 1-13, July.
    2. Jelte M Wicherts & Marjan Bakker & Dylan Molenaar, 2011. "Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results," PLOS ONE, Public Library of Science, vol. 6(11), pages 1-7, November.
    3. Cédric Notredame, 2007. "Recent Evolutions of Multiple Sequence Alignment Algorithms," PLOS Computational Biology, Public Library of Science, vol. 3(8), pages 1-4, August.
    4. Caroline J Savage & Andrew J Vickers, 2009. "Empirical Study of Data Sharing by Authors Publishing in PLoS Journals," PLOS ONE, Public Library of Science, vol. 4(9), pages 1-3, September.
    5. Alawi A Alsheikh-Ali & Waqas Qureshi & Mouaz H Al-Mallah & John P A Ioannidis, 2011. "Public Availability of Published Research Data in High-Impact Journals," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-4, September.
    6. Heather A Piwowar & Roger S Day & Douglas B Fridsma, 2007. "Sharing Detailed Research Data Is Associated with Increased Citation Rate," PLOS ONE, Public Library of Science, vol. 2(3), pages 1-5, March.
    7. Bryan T Drew & Romina Gazis & Patricia Cabezas & Kristen S Swithers & Jiabin Deng & Roseana Rodriguez & Laura A Katz & Keith A Crandall & David S Hibbett & Douglas E Soltis, 2013. "Lost Branches on the Tree of Life," PLOS Biology, Public Library of Science, vol. 11(9), pages 1-5, September.
    8. Julie D Thompson & Benjamin Linard & Odile Lecompte & Olivier Poch, 2011. "A Comprehensive Benchmark Study of Multiple Sequence Alignment Methods: Current Challenges and Future Perspectives," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-14, March.

    Citations

    Cited by:

    1. Mallory C Kidwell & Ljiljana B Lazarević & Erica Baranski & Tom E Hardwicke & Sarah Piechowski & Lina-Sophia Falkenberg & Curtis Kennett & Agnieszka Slowik & Carina Sonnleitner & Chelsey Hess-Holden et al., 2016. "Badges to Acknowledge Open Practices: A Simple, Low-Cost, Effective Method for Increasing Transparency," PLOS Biology, Public Library of Science, vol. 14(5), pages 1-15, May.
    2. Simon Robin Evans, 2016. "Gauging the Purported Costs of Public Data Archiving for Long-Term Population Studies," PLOS Biology, Public Library of Science, vol. 14(4), pages 1-9, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Genevieve Pham-Kanter & Darren E Zinner & Eric G Campbell, 2014. "Codifying Collegiality: Recent Developments in Data Sharing Policy in the Life Sciences," PLOS ONE, Public Library of Science, vol. 9(9), pages 1-8, September.
    2. Stefan Stieglitz & Konstantin Wilms & Milad Mirbabaie & Lennart Hofeditz & Bela Brenger & Ania López & Stephanie Rehwald, 2020. "When are researchers willing to share their data? – Impacts of values and uncertainty on open data in academia," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-20, July.
    3. Bryan T Drew & Romina Gazis & Patricia Cabezas & Kristen S Swithers & Jiabin Deng & Roseana Rodriguez & Laura A Katz & Keith A Crandall & David S Hibbett & Douglas E Soltis, 2013. "Lost Branches on the Tree of Life," PLOS Biology, Public Library of Science, vol. 11(9), pages 1-5, September.
    4. Nicola Milia & Alessandra Congiu & Paolo Anagnostou & Francesco Montinaro & Marco Capocasa & Emanuele Sanna & Giovanni Destro Bisol, 2012. "Mine, Yours, Ours? Sharing Data on Human Genetic Variation," PLOS ONE, Public Library of Science, vol. 7(6), pages 1-8, June.
    5. Jillian C Wallis & Elizabeth Rolando & Christine L Borgman, 2013. "If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-17, July.
    6. Marjan Bakker & Jelte M Wicherts, 2014. "Outlier Removal and the Relation with Reporting Errors and Quality of Psychological Research," PLOS ONE, Public Library of Science, vol. 9(7), pages 1-9, July.
    7. Victoria Stodden & Peixuan Guo & Zhaokun Ma, 2013. "Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-8, June.
    8. Zeng, Tong & Wu, Longfeng & Bratt, Sarah & Acuna, Daniel E., 2020. "Assigning credit to scientific datasets using article citation networks," Journal of Informetrics, Elsevier, vol. 14(2).
    9. John Ernest Kratz & Carly Strasser, 2015. "Researcher Perspectives on Publication and Peer Review of Data," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-21, February.
    10. Simon Robin Evans, 2016. "Gauging the Purported Costs of Public Data Archiving for Long-Term Population Studies," PLOS Biology, Public Library of Science, vol. 14(4), pages 1-9, April.
    11. Stephanie B Linek & Benedikt Fecher & Sascha Friesike & Marcel Hebing, 2017. "Data sharing as social dilemma: Influence of the researcher’s personality," PLOS ONE, Public Library of Science, vol. 12(8), pages 1-24, August.
    12. Coosje L S Veldkamp & Michèle B Nuijten & Linda Dominguez-Alvarez & Marcel A L M van Assen & Jelte M Wicherts, 2014. "Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-19, December.
    13. Daniel L Parton & Patrick B Grinaway & Sonya M Hanson & Kyle A Beauchamp & John D Chodera, 2016. "Ensembler: Enabling High-Throughput Molecular Simulations at the Superfamily Scale," PLOS Computational Biology, Public Library of Science, vol. 12(6), pages 1-25, June.
    14. Dominique G Roche & Loeske E B Kruuk & Robert Lanfear & Sandra A Binning, 2015. "Public Data Archiving in Ecology and Evolution: How Well Are We Doing?," PLOS Biology, Public Library of Science, vol. 13(11), pages 1-12, November.
    15. Matteo Colombo & Georgi Duev & Michèle B Nuijten & Jan Sprenger, 2018. "Statistical reporting inconsistencies in experimental philosophy," PLOS ONE, Public Library of Science, vol. 13(4), pages 1-12, April.
    16. Damien M O’Halloran, 2017. "phylo-node: A molecular phylogenetic toolkit using Node.js," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-8, April.
    17. Catrin Tudur Smith & Kerry Dwan & Douglas G Altman & Mike Clarke & Richard Riley & Paula R Williamson, 2014. "Sharing Individual Participant Data from Clinical Trials: An Opinion Survey Regarding the Establishment of a Central Repository," PLOS ONE, Public Library of Science, vol. 9(5), pages 1-8, May.
    18. Jan H. Höffler, 2017. "Replication and Economics Journal Policies," American Economic Review, American Economic Association, vol. 107(5), pages 52-55, May.
    19. Carol Tenopir & Suzie Allard & Kimberly Douglass & Arsev Umur Aydinoglu & Lei Wu & Eleanor Read & Maribeth Manoff & Mike Frame, 2011. "Data Sharing by Scientists: Practices and Perceptions," PLOS ONE, Public Library of Science, vol. 6(6), pages 1-21, June.
    20. Alawi A Alsheikh-Ali & Waqas Qureshi & Mouaz H Al-Mallah & John P A Ioannidis, 2011. "Public Availability of Published Research Data in High-Impact Journals," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-4, September.
