Printed from https://ideas.repec.org/a/plo/pcbi00/1004867.html

An Introduction to Programming for Bioscientists: A Python-Based Primer

Author

Listed:
  • Berk Ekmekci
  • Charles E McAnany
  • Cameron Mura

Abstract

Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in molecular biology, biochemistry, and other biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language’s usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. 
Starting with basic concepts, such as that of a “variable,” the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences.

Author Summary

Contemporary biology has largely become computational biology, whether it involves applying physical principles to simulate the motion of each atom in a piece of DNA, or using machine learning algorithms to integrate and mine “omics” data across whole cells (or even entire ecosystems). The ability to design algorithms and program computers, even at a novice level, may be the most indispensable skill that a modern researcher can cultivate. As with human languages, computational fluency is developed actively, not passively. This self-contained text, structured as a hybrid primer/tutorial, introduces any biologist—from college freshman to established senior scientist—to basic computing principles (control-flow, recursion, regular expressions, etc.) and the practicalities of programming and software design. We use the Python language because it now pervades virtually every domain of the biosciences, from sequence-based bioinformatics and molecular evolution to phylogenomics, systems biology, structural biology, and beyond. To introduce both coding (in general) and Python (in particular), we guide the reader via concrete examples and exercises. We also supply, as Supplemental Chapters, a few thousand lines of heavily-annotated, freely distributed source code for personal study.
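
The abstract mentions computing the Hamming distance between two DNA sequences. As a rough illustration of what that calculation involves, here is a minimal Python sketch. It is not taken from the primer or its Supplemental Chapters: the function name `hamming_distance`, the case-insensitive comparison, and the example sequences are all assumptions made for this illustration, and no graphical interface is included.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ.

    Hypothetical helper for illustration; not the primer's own code.
    """
    if len(seq_a) != len(seq_b):
        # The Hamming distance is only defined for sequences of equal length.
        raise ValueError("sequences must have the same length")
    # Compare case-insensitively (an assumption), position by position.
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


# Made-up example sequences that differ at two positions.
print(hamming_distance("GATTACA", "GACTATA"))  # prints 2
```

Raising an error for sequences of unequal length, rather than silently truncating to the shorter one, is a design choice of this sketch; comparing aligned sequences with gaps would need a different treatment.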

Suggested Citation

  • Berk Ekmekci & Charles E McAnany & Cameron Mura, 2016. "An Introduction to Programming for Bioscientists: A Python-Based Primer," PLOS Computational Biology, Public Library of Science, vol. 12(6), pages 1-43, June.
  • Handle: RePEc:plo:pcbi00:1004867
    DOI: 10.1371/journal.pcbi.1004867

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004867
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004867&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1004867?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Amir Rubinstein & Benny Chor, 2014. "Computational Thinking in Life Science Education," PLOS Computational Biology, Public Library of Science, vol. 10(11), pages 1-5, November.
    2. Tsang, Eric W. K., 2014. "Old and New," Management and Organization Review, Cambridge University Press, vol. 10(03), pages 390-390, November.
    3. Nick Barnes, 2010. "Publish your computer code: it is good enough," Nature, Nature, vol. 467(7317), pages 753-753, October.
    4. Xia, Xiao-Qin & McClelland, Michael & Wang, Yipeng, 2010. "PypeR, A Python Package for Using R in Python," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 35(c02).
    5. Lonnie Welch & Fran Lewitter & Russell Schwartz & Cath Brooksbank & Predrag Radivojac & Bruno Gaeta & Maria Victoria Schneider, 2014. "Bioinformatics Curriculum Guidelines: Toward a Definition of Core Competencies," PLOS Computational Biology, Public Library of Science, vol. 10(3), pages 1-10, March.

    Citations


    Cited by:

    1. Cameron Mura & Mike Chalupa & Abigail M Newbury & Jack Chalupa & Philip E Bourne, 2020. "Ten simple rules for starting research in your late teens," PLOS Computational Biology, Public Library of Science, vol. 16(11), pages 1-11, November.
    2. Richard A Erickson & Michael N Fienen & S Grace McCalla & Emily L Weiser & Melvin L Bower & Jonathan M Knudson & Greg Thain, 2018. "Wrangling distributed computing for high-throughput environmental science: An introduction to HTCondor," PLOS Computational Biology, Public Library of Science, vol. 14(10), pages 1-8, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ali Shalizar Jalali, 2018. "Male Fertility as a Bull’s Eye for Mastocytosis," Global Journal of Reproductive Medicine, Juniper Publishers Inc., vol. 3(3), pages 58-60, February.
    2. Hui Yan & Guixiang Liu, 2021. "Fire’s Effects on Grassland Restoration and Biodiversity Conservation," Sustainability, MDPI, vol. 13(21), pages 1-15, October.
    3. Michal Plaček & Martin Schmidt & František Ochrana & Michal Půček, 2017. "Do the Selected Characteristics of Public Tenders Affect the Likelihood of Filing Petitions with the Regulators of Public Tenders?," Prague Economic Papers, Prague University of Economics and Business, vol. 2017(3), pages 317-329.
    4. Nikolov, Plamen & Adelman, Alan, 2019. "Do private household transfers to the elderly respond to public pension benefits? Evidence from rural China," The Journal of the Economics of Ageing, Elsevier, vol. 14(C).
    5. Dana Benešová & Viera Kubičková & Miroslava Prváková, 2020. "Open innovation model in the knowledge intensive business services in the Slovak Republic," Entrepreneurship and Sustainability Issues, VsI Entrepreneurship and Sustainability Center, vol. 8(2), pages 1340-1358, December.
    6. Holzmann, Robert & Alonso-García, Jennifer & Labit-Hardy, Heloise & Villegas, Andres M., 2017. "NDC Schemes and Heterogeneity in Longevity: Proposals for Redesign," IZA Discussion Papers 11193, Institute of Labor Economics (IZA).
    7. Selman, P., 2014. "Intercountry Adoption Agencies and the HCIA," ISS Working Papers - General Series 77404, International Institute of Social Studies of Erasmus University Rotterdam (ISS), The Hague.
    8. Martinho, Vítor João Pereira Domingues, 2019. "Historical records of wine: Highlighting the old wine world," EconStor Preprints 193461, ZBW - Leibniz Information Centre for Economics.
    9. Gabriella Garbarino & Giovanni Pampararo & Thanh Khoa Phung & Paola Riani & Guido Busca, 2020. "Heterogeneous Catalysis in (Bio)Ethanol Conversion to Chemicals and Fuels: Thermodynamics, Catalysis, Reaction Paths, Mechanisms and Product Selectivities," Energies, MDPI, vol. 13(14), pages 1-19, July.
    10. Zhongcheng Yan & Feng Wei & Xin Deng & Chuan Li & Qiang He & Yanbin Qi, 2022. "Feminization of Agriculture: Do Female Farmers Have Higher Expectations for the Value of Their Farmland?—Empirical Evidence from China," Agriculture, MDPI, vol. 12(1), pages 1-22, January.
    11. Hélène Laurell & Leona Achtenhagen & Svante Andersson, 2017. "The changing role of network ties and critical capabilities in an international new venture’s early development," International Entrepreneurship and Management Journal, Springer, vol. 13(1), pages 113-140, March.
    12. Trine Filges & Anu Siren & Torben Fridberg & Bjørn C. V. Nielsen, 2020. "Voluntary work for the physical and mental health of older volunteers: A systematic review," Campbell Systematic Reviews, John Wiley & Sons, vol. 16(4), December.
    13. Alexandru-Ionuţ Petrişor & Walid Hamma & Huu Duy Nguyen & Giovanni Randazzo & Anselme Muzirafuti & Mari-Isabella Stan & Van Truong Tran & Roxana Aştefănoaiei & Quang-Thanh Bui & Dragoş-Florian Vintilă, 2020. "Degradation of Coastlines under the Pressure of Urbanization and Tourism: Evidence on the Change of Land Systems from Europe, Asia and Africa," Land, MDPI, vol. 9(8), pages 1-43, August.
    14. repec:ers:journl:v:special_issue:y:2018:i:1:p:466-478 is not listed on IDEAS
    15. Sellami Sana & Verhaest Dieter & Nonneman Walter & Van Trier Walter, 2017. "The Impact of Educational Mismatches on Wages: The Influence of Measurement Error and Unobserved Heterogeneity," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 17(1), pages 1-20, February.
    16. Xavier Gabaix & Jean‐Michel Lasry & Pierre‐Louis Lions & Benjamin Moll, 2016. "The Dynamics of Inequality," Econometrica, Econometric Society, vol. 84, pages 2071-2111, November.
    17. Kenneth M. Johnson & Daniel T. Lichter, 2016. "Diverging Demography: Hispanic and Non-Hispanic Contributions to U.S. Population Redistribution and Diversity," Population Research and Policy Review, Springer;Southern Demographic Association (SDA), vol. 35(5), pages 705-725, October.
    18. Su, Guifu & Tu, Jianhua & Das, Kinkar Ch., 2015. "Graphs with fixed number of pendent vertices and minimal Zeroth-order general Randić index," Applied Mathematics and Computation, Elsevier, vol. 270(C), pages 705-710.
    19. Zbigniew Drewniak & Rafal Drewniak & Robert Karaszewski, 2020. "The Assessment of the Features of Inter-organisational Relationships: Benefits, Duration, Repeatability and Maturity of the Relationship with the Company's Stakeholders," European Research Studies Journal, European Research Studies Journal, vol. 0(Special 1), pages 443-461.
    20. Tanja Lepistö & Tiina Mäkitalo-Keinonen & Tiina Valjakka, 0. "Opportunity recognition in a hub-governed network – insights from garage services," International Entrepreneurship and Management Journal, Springer, vol. 0, pages 1-24.
    21. Markus Mykkänen & Neil Freshwater, 2021. "Typology of think tanks: A comparative study in Finland and Scotland," Academicus International Scientific Journal, Entrepreneurship Training Center Albania, issue 23, pages 72-90, January.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.