
SNPpy - Database Management for SNP Data from Genome Wide Association Studies

Authors

  • Faheem Mitha
  • Herodotos Herodotou
  • Nedyalko Borisov
  • Chen Jiang
  • Josh Yoder
  • Kouros Owzar

Abstract

Background: We describe SNPpy, a hybrid script-database system that uses the Python SQLAlchemy library together with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). The system makes it possible to merge study data with HapMap data and to merge across studies for meta-analyses, with data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP) data. SNPpy and its dependencies are open source software.

Results: The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses.

Conclusions: By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results in standardized GWAS file formats. It performs flexible, low-level data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data.
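
To illustrate the relational approach the abstract describes, here is a minimal sketch of merging genotype data across studies and filtering by phenotype and marker. It is not SNPpy's schema or API: the table names, column names, and the helpers build_demo_db and select_genotypes are hypothetical, and Python's built-in sqlite3 module stands in for the PostgreSQL and SQLAlchemy stack so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical GWAS schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (
            id        INTEGER PRIMARY KEY,
            study     TEXT NOT NULL,
            phenotype TEXT NOT NULL
        );
        CREATE TABLE snp (
            rsid  TEXT PRIMARY KEY,
            chrom TEXT NOT NULL
        );
        -- The CHECK constraint is a simple form of low-level validation:
        -- the database rejects any genotype call outside the allowed set.
        CREATE TABLE genotype (
            patient_id  INTEGER NOT NULL REFERENCES patient(id),
            rsid        TEXT NOT NULL REFERENCES snp(rsid),
            allele_call TEXT NOT NULL
                CHECK (allele_call IN ('AA', 'AB', 'BB', 'NC')),
            PRIMARY KEY (patient_id, rsid)
        );
        """
    )
    # Made-up rows from two hypothetical studies sharing one marker set.
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?, ?)",
        [(1, "study_a", "case"), (2, "study_a", "control"), (3, "study_b", "case")],
    )
    conn.executemany("INSERT INTO snp VALUES (?, ?)", [("rs1", "1"), ("rs2", "2")])
    conn.executemany(
        "INSERT INTO genotype VALUES (?, ?, ?)",
        [(1, "rs1", "AA"), (1, "rs2", "AB"), (2, "rs1", "BB"),
         (3, "rs1", "AB"), (3, "rs2", "NC")],
    )
    conn.commit()
    return conn


def select_genotypes(conn, phenotype, chrom):
    """Return genotype rows across all studies for one phenotype and chromosome."""
    query = """
        SELECT p.study, p.id, g.rsid, g.allele_call
        FROM genotype AS g
        JOIN patient AS p ON p.id = g.patient_id
        JOIN snp AS s ON s.rsid = g.rsid
        WHERE p.phenotype = ? AND s.chrom = ?
        ORDER BY p.id, g.rsid
    """
    return conn.execute(query, (phenotype, chrom)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    # Print one tab-delimited line per selected genotype.
    for row in select_genotypes(demo, "case", "1"):
        print("\t".join(str(field) for field in row))
```

Because patients from both studies live in one table, the cross-study merge reduces to a join, and patient and marker selection become WHERE conditions. A real deployment would also need platform-specific import code and export to established GWAS file formats, which this sketch leaves out.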

Suggested Citation

  • Faheem Mitha & Herodotos Herodotou & Nedyalko Borisov & Chen Jiang & Josh Yoder & Kouros Owzar, 2011. "SNPpy - Database Management for SNP Data from Genome Wide Association Studies," PLOS ONE, Public Library of Science, vol. 6(10), pages 1-8, October.
  • Handle: RePEc:plo:pone00:0024982
    DOI: 10.1371/journal.pone.0024982

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0024982
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0024982&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Bryan N Howie & Peter Donnelly & Jonathan Marchini, 2009. "A Flexible and Accurate Genotype Imputation Method for the Next Generation of Genome-Wide Association Studies," PLOS Genetics, Public Library of Science, vol. 5(6), pages 1-15, June.

    Citations

    Cited by:

    1. Yikun Zhao & Bin Jiang & Yongxue Huo & Hongmei Yi & Hongli Tian & Haotian Wu & Rui Wang & Jiuran Zhao & Fengge Wang, 2021. "A High-Performance Database Management System for Managing and Analyzing Large-Scale SNP Data in Plant Genotyping and Breeding Applications," Agriculture, MDPI, vol. 11(11), pages 1-21, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chuan Gao & Nan Wang & Xiuqing Guo & Julie T Ziegler & Kent D Taylor & Anny H Xiang & Yang Hai & Steven J Kridel & Jerry L Nadler & Fouad Kandeel & Leslie J Raffel & Yii-Der I Chen & Jill M Norris & J, 2015. "A Comprehensive Analysis of Common and Rare Variants to Identify Adiposity Loci in Hispanic Americans: The IRAS Family Study (IRASFS)," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-17, November.
    2. Rakesh Chettier & Lesa Nelson & James W Ogilvie & Hans M Albertsen & Kenneth Ward, 2015. "Haplotypes at LBX1 Have Distinct Inheritance Patterns with Opposite Effects in Adolescent Idiopathic Scoliosis," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-11, February.
    3. Michel S. Naslavsky & Marilia O. Scliar & Guilherme L. Yamamoto & Jaqueline Yu Ting Wang & Stepanka Zverinova & Tatiana Karp & Kelly Nunes & José Ricardo Magliocco Ceroni & Diego Lima Carvalho & Carlo, 2022. "Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    4. E P A van Iperen & G K Hovingh & F W Asselbergs & A H Zwinderman, 2017. "Extending the use of GWAS data by combining data from different genetic platforms," PLOS ONE, Public Library of Science, vol. 12(2), pages 1-11, February.
    5. Carl Nettelblad, 2013. "Breakdown of Methods for Phasing and Imputation in the Presence of Double Genotype Sharing," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-5, March.
    6. Joseph Vijai & Tomas Kirchhoff & Kasmintan A Schrader & Jennifer Brown & Ana Virginia Dutra-Clarke & Christopher Manschreck & Nichole Hansen & Rohini Rau-Murthy & Kara Sarrel & Jennifer Przybylo & Soh, 2013. "Susceptibility Loci Associated with Specific and Shared Subtypes of Lymphoid Malignancies," PLOS Genetics, Public Library of Science, vol. 9(1), pages 1-11, January.
    7. Viinikainen, Jutta & Bryson, Alex & Böckerman, Petri & Kari, Jaana T. & Lehtimäki, Terho & Raitakari, Olli & Viikari, Jorma & Pehkonen, Jaakko, 2022. "Does better education mitigate risky health behavior? A mendelian randomization study," Economics & Human Biology, Elsevier, vol. 46(C).
    8. Morten Dybdahl Krebs & Gonçalo Espregueira Themudo & Michael Eriksen Benros & Ole Mors & Anders D. Børglum & David Hougaard & Preben Bo Mortensen & Merete Nordentoft & Michael J. Gandal & Chun Chieh F, 2021. "Associations between patterns in comorbid diagnostic trajectories of individuals with schizophrenia and etiological factors," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    9. Mette K Andersen & Emil Jørsboe & Line Skotte & Kristian Hanghøj & Camilla H Sandholt & Ida Moltke & Niels Grarup & Timo Kern & Yuvaraj Mahendran & Bolette Søborg & Peter Bjerregaard & Christina V L L, 2020. "The derived allele of a novel intergenic variant at chromosome 11 associates with lower body mass index and a favorable metabolic phenotype in Greenlanders," PLOS Genetics, Public Library of Science, vol. 16(1), pages 1-17, January.
    10. Hans M Albertsen & Rakesh Chettier & Pamela Farrington & Kenneth Ward, 2013. "Genome-Wide Association Study Link Novel Loci to Endometriosis," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-8, March.
    11. Qingqin S Li & Antonio R Parrado & Mahesh N Samtani & Vaibhav A Narayan & Alzheimer’s Disease Neuroimaging Initiative, 2015. "Variations in the FRA10AC1 Fragile Site and 15q21 Are Associated with Cerebrospinal Fluid Aβ1-42 Level," PLOS ONE, Public Library of Science, vol. 10(8), pages 1-17, August.
    12. Peng Chen & Rick Twee-Hee Ong & Wan-Ting Tay & Xueling Sim & Mohammad Ali & Haiyan Xu & Chen Suo & Jianjun Liu & Kee-Seng Chia & Eranga Vithana & Terri L Young & Tin Aung & Wei-Yen Lim & Chiea-Chuen K, 2013. "A Study Assessing the Association of Glycated Hemoglobin A1C (HbA1C) Associated Variants with HbA1C, Chronic Kidney Disease and Diabetic Retinopathy in Populations of Asian Ancestry," PLOS ONE, Public Library of Science, vol. 8(11), pages 1-1, November.
    13. Markus Draaken & Michael Knapp & Tracie Pennimpede & Johanna M Schmidt & Anne-Karolin Ebert & Wolfgang Rösch & Raimund Stein & Boris Utsch & Karin Hirsch & Thomas M Boemers & Elisabeth Mangold & Stefa, 2015. "Genome-wide Association Study and Meta-Analysis Identify ISL1 as Genome-wide Significant Susceptibility Gene for Bladder Exstrophy," PLOS Genetics, Public Library of Science, vol. 11(3), pages 1-13, March.
    14. Sara L Van Driest & Tracy L McGregor & Digna R Velez Edwards & Ben R Saville & Terrie E Kitchner & Scott J Hebbring & Murray Brilliant & Hayan Jouni & Iftikhar J Kullo & C Buddy Creech & Prince J Kann, 2015. "Genome-Wide Association Study of Serum Creatinine Levels during Vancomycin Therapy," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-14, June.
    15. Giuseppe Matullo & Simonetta Guarrera & Marta Betti & Giovanni Fiorito & Daniela Ferrante & Floriana Voglino & Gemma Cadby & Cornelia Di Gaetano & Fabio Rosa & Alessia Russo & Ari Hirvonen & Elisabett, 2013. "Genetic Variants Associated with Increased Risk of Malignant Pleural Mesothelioma: A Genome-Wide Association Study," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-11, April.
    16. Myoung Keun Lee & John R Shaffer & Elizabeth J Leslie & Ekaterina Orlova & Jenna C Carlson & Eleanor Feingold & Mary L Marazita & Seth M Weinberg, 2017. "Genome-wide association study of facial morphology reveals novel associations with FREM1 and PARK2," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-13, April.
    17. Anu Loukola & Jadwiga Buchwald & Richa Gupta & Teemu Palviainen & Jenni Hällfors & Emmi Tikkanen & Tellervo Korhonen & Miina Ollikainen & Antti-Pekka Sarin & Samuli Ripatti & Terho Lehtimäki & Olli Ra, 2015. "A Genome-Wide Association Study of a Biomarker of Nicotine Metabolism," PLOS Genetics, Public Library of Science, vol. 11(9), pages 1-23, September.
    18. Taru Tukiainen & Matti Pirinen & Antti-Pekka Sarin & Claes Ladenvall & Johannes Kettunen & Terho Lehtimäki & Marja-Liisa Lokki & Markus Perola & Juha Sinisalo & Efthymia Vlachopoulou & Johan G Eriksso, 2014. "Chromosome X-Wide Association Study Identifies Loci for Fasting Insulin and Height and Evidence for Incomplete Dosage Compensation," PLOS Genetics, Public Library of Science, vol. 10(2), pages 1-12, February.
    19. Wei-Yu Lin & Ian W Brock & Dan Connley & Helen Cramp & Rachel Tucker & Jon Slate & Malcolm W R Reed & Sabapathy P Balasubramanian & Lisa A Cannon-Albright & Nicola J Camp & Angela Cox, 2013. "Associations of ATR and CHEK1 Single Nucleotide Polymorphisms with Breast Cancer," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-1, July.
    20. Harriëtte Riese & Loretto M Muñoz & Catharina A Hartman & Xiuhua Ding & Shaoyong Su & Albertine J Oldehinkel & Arie M van Roon & Peter J van der Most & Joop Lefrandt & Ron T Gansevoort & Pim van der H, 2014. "Identifying Genetic Variants for Heart Rate Variability in the Acetylcholine Pathway," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-9, November.
