
A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies

Author

Listed:
  • Oliver Stegle
  • Leopold Parts
  • Richard Durbin
  • John Winn

Abstract

Gene expression measurements are influenced by a wide range of factors, such as the state of the cell, experimental conditions and variants in the sequence of regulatory regions. To understand the effect of a variable of interest, such as the genotype of a locus, it is important to account for variation that is due to confounding causes. Here, we present VBQTL, a probabilistic approach for mapping expression quantitative trait loci (eQTLs) that jointly models contributions from genotype as well as known and hidden confounding factors. VBQTL is implemented within an efficient and flexible inference framework, making it fast and tractable on large-scale problems. We compare the performance of VBQTL with alternative methods for dealing with confounding variability on eQTL mapping datasets from simulations, yeast, mouse, and human. Employing Bayesian complexity control and joint modelling is shown to result in more precise estimates of the contribution of different confounding factors, resulting in additional associations to measured transcript levels compared to alternative approaches. We present a threefold larger collection of cis eQTLs than previously found in a whole-genome eQTL scan of an outbred human population. Altogether, 27% of the tested probes show a significant genetic association in cis, and we validate that the additional eQTLs are likely to be real by replicating them in different sets of individuals. Our method is the next step in the analysis of high-dimensional phenotype data, and its application has revealed insights into genetic regulation of gene expression by demonstrating more abundant cis-acting eQTLs in human than previously shown. Our software is freely available online at http://www.sanger.ac.uk/resources/software/peer/.

Author Summary

Gene expression is a complex phenotype. The measured expression level in an experiment can be affected by a wide range of factors: the state of the cell, experimental conditions, variants in the sequence of regulatory regions, and others. To understand genotype-to-phenotype relationships, we need to be able to distinguish the variation that is due to the genetic state from all the confounding causes. We present VBQTL, a probabilistic method for dissecting gene expression variation by jointly modelling the underlying global causes of variability and the genetic effect. Our method is implemented in a flexible framework that allows for quick model adaptation and comparison with alternative models. The probabilistic approach yields more accurate estimates of the contributions from different sources of variation. Applying VBQTL, we find that common genetic variation controlling gene expression levels in human is more abundant than previously shown, which has implications for a wide range of studies relating genotype to phenotype.
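
Illustrative sketch (not part of the original record): the toy simulation below may help show why accounting for a hidden confounder can increase power to detect an eQTL. It is not VBQTL or the PEER software. It swaps in a much simpler two-step stand-in, estimating one hidden factor by principal component analysis and regressing it out before testing, whereas the abstract describes a joint Bayesian model. All names, sample sizes and effect sizes are assumptions chosen for illustration.

```python
import numpy as np


def simulate(n_samples=300, n_genes=100, effect=1.0, seed=0):
    """Toy data: one hidden factor affects every gene; one SNP affects gene 0."""
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, 0.3, size=n_samples).astype(float)
    hidden = rng.normal(size=n_samples)            # unobserved confounder
    weights = rng.normal(scale=2.0, size=n_genes)  # its effect on each gene
    weights[0] = 3.0                               # gene 0 is strongly confounded
    expr = np.outer(hidden, weights) + rng.normal(size=(n_samples, n_genes))
    expr[:, 0] += effect * genotype                # the simulated cis eQTL signal
    return genotype, hidden, expr


def top_components(expr, k=1):
    """Estimate k hidden factors as the leading principal components (via SVD)."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def residualize(y, covariates):
    """Remove the least-squares fit of y on the covariates plus an intercept."""
    design = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def assoc_t(y, x):
    """Absolute t-statistic of the Pearson correlation between y and x.

    Simplification: degrees of freedom are not adjusted for removed factors.
    """
    r = np.corrcoef(y, x)[0, 1]
    return abs(r) * np.sqrt((len(y) - 2) / (1.0 - r**2))


genotype, hidden, expr = simulate()
factors = top_components(expr, k=1)
t_raw = assoc_t(expr[:, 0], genotype)
t_adj = assoc_t(residualize(expr[:, 0], factors), genotype)
print(f"t-statistic without correction: {t_raw:.2f}")
print(f"t-statistic after removing 1 estimated factor: {t_adj:.2f}")
```

In this simulated setting the hidden factor dominates the variance of gene 0, so removing an estimate of it leaves the genotype effect easier to detect. A two-step correction like this one can also absorb genuine genetic signal when a variant affects many genes, which is one motivation for modelling genotype and hidden factors jointly as the abstract describes.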

Suggested Citation

  • Oliver Stegle & Leopold Parts & Richard Durbin & John Winn, 2010. "A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies," PLOS Computational Biology, Public Library of Science, vol. 6(5), pages 1-11, May.
  • Handle: RePEc:plo:pcbi00:1000770
    DOI: 10.1371/journal.pcbi.1000770

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000770
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000770&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Yanqing Chen & Jun Zhu & Pek Yee Lum & Xia Yang & Shirly Pinto & Douglas J. MacNeil & Chunsheng Zhang & John Lamb & Stephen Edwards & Solveig K. Sieberts & Amy Leonardson & Lawrence W. Castellini & Su, 2008. "Variations in DNA elucidate molecular networks that cause disease," Nature, Nature, vol. 452(7186), pages 429-435, March.
    2. Jeffrey T Leek & John D Storey, 2007. "Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis," PLOS Genetics, Public Library of Science, vol. 3(9), pages 1-12, September.

    Citations


    Cited by:

    1. Satria P. Sajuthi & Jamie L. Everman & Nathan D. Jackson & Benjamin Saef & Cydney L. Rios & Camille M. Moore & Angel C. Y. Mak & Celeste Eng & Ana Fairbanks-Mahnke & Sandra Salazar & Jennifer Elhawary, 2022. "Nasal airway transcriptome-wide association study of asthma reveals genetically driven mucus pathobiology," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    2. Seong Kyu Han & Michelle T. McNulty & Christopher J. Benway & Pei Wen & Anya Greenberg & Ana C. Onuchic-Whitford & Dongkeun Jang & Jason Flannick & Noël P. Burtt & Parker C. Wilson & Benjamin D. Humph, 2023. "Mapping genomic regulation of kidney disease and traits through high-resolution and interpretable eQTLs," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    3. M. A. Zouache & B. T. Richards & C. M. Pappas & R. A. Anstadt & J. Liu & T. Corsetti & S. Matthews & N. A. Seager & S. Schmitz-Valckenberg & M. Fleckenstein & W. C. Hubbard & J. Thomas & J. L. Hageman, 2024. "Levels of complement factor H-related 4 protein do not influence susceptibility to age-related macular degeneration or its course of progression," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    4. Ryo Yamamoto & Ryan Chung & Juan Manuel Vazquez & Huanjie Sheng & Philippa L. Steinberg & Nilah M. Ioannidis & Peter H. Sudmant, 2022. "Tissue-specific impacts of aging and genetics on gene expression patterns in humans," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    5. Nicoló Fusi & Oliver Stegle & Neil D Lawrence, 2012. "Joint Modelling of Confounding Factors and Prominent Genetic Regulators Provides Increased Accuracy in Genetical Genomics Studies," PLOS Computational Biology, Public Library of Science, vol. 8(1), pages 1-9, January.
    6. Barbara E Stranger & Stephen B Montgomery & Antigone S Dimas & Leopold Parts & Oliver Stegle & Catherine E Ingle & Magda Sekowska & George Davey Smith & David Evans & Maria Gutierrez-Arcelus & Alkes P, 2012. "Patterns of Cis Regulatory Variation in Diverse Human Populations," PLOS Genetics, Public Library of Science, vol. 8(4), pages 1-13, April.
    7. Chuan Gao & Ian C McDowell & Shiwen Zhao & Christopher D Brown & Barbara E Engelhardt, 2016. "Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-39, July.
    8. Jin Hyun Ju & Sushila A Shenoy & Ronald G Crystal & Jason G Mezey, 2017. "An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci," PLOS Computational Biology, Public Library of Science, vol. 13(5), pages 1-26, May.
    9. Nikolaos Ignatiadis & Wolfgang Huber, 2021. "Covariate powered cross‐weighted multiple testing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(4), pages 720-751, September.
    10. Yu Yan & Hongbo Liu & Amin Abedini & Xin Sheng & Matthew Palmer & Hongzhe Li & Katalin Susztak, 2024. "Unraveling the epigenetic code: human kidney DNA methylation and chromatin dynamics in renal disease development," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    11. Sébastien Thériault & Zhonglin Li & Erik Abner & Jian’an Luan & Hasanga D. Manikpurage & Ursula Houessou & Pardis Zamani & Mewen Briend & Dominique K. Boudreau & Nathalie Gaudreault & Lily Frenette & , 2024. "Integrative genomic analyses identify candidate causal genes for calcific aortic valve stenosis involving tissue-specific regulation," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    12. Kaido Lepik & Tarmo Annilo & Viktorija Kukuškina & eQTLGen Consortium & Kai Kisand & Zoltán Kutalik & Pärt Peterson & Hedi Peterson, 2017. "C-reactive protein upregulates the whole blood expression of CD59 - an integrative analysis," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-20, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Won Jun Lee & Sang Cheol Kim & Jung-Ho Yoon & Sang Jun Yoon & Johan Lim & You-Sun Kim & Sung Won Kwon & Jeong Hill Park, 2016. "Meta-Analysis of Tumor Stem-Like Breast Cancer Cells Using Gene Set and Network Analysis," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-20, February.
    2. Jin Hyun Ju & Sushila A Shenoy & Ronald G Crystal & Jason G Mezey, 2017. "An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci," PLOS Computational Biology, Public Library of Science, vol. 13(5), pages 1-26, May.
    3. Arjun Bhattacharya & Anastasia N. Freedman & Vennela Avula & Rebeca Harris & Weifang Liu & Calvin Pan & Aldons J. Lusis & Robert M. Joseph & Lisa Smeester & Hadley J. Hartwell & Karl C. K. Kuban & Car, 2022. "Placental genomics mediates genetic associations with complex health traits and disease," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    4. repec:jss:jstsof:40:i14 is not listed on IDEAS
    5. Emma Pierson & the GTEx Consortium & Daphne Koller & Alexis Battle & Sara Mostafavi, 2015. "Sharing and Specificity of Co-expression Networks across 35 Human Tissues," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-19, May.
    6. Kai Wang & Manikandan Narayanan & Hua Zhong & Martin Tompa & Eric E Schadt & Jun Zhu, 2009. "Meta-analysis of Inter-species Liver Co-expression Networks Elucidates Traits Associated with Common Human Diseases," PLOS Computational Biology, Public Library of Science, vol. 5(12), pages 1-16, December.
    7. Valur Emilsson & Elias F. Gudmundsson & Thorarinn Jonmundsson & Brynjolfur G. Jonsson & Michael Twarog & Valborg Gudmundsdottir & Zhiguang Li & Nancy Finkel & Stephen Poor & Xin Liu & Robert Esterberg, 2022. "A proteogenomic signature of age-related macular degeneration in blood," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    8. Emanuele Aliverti & Kristian Lum & James E. Johndrow & David B. Dunson, 2021. "Removing the influence of group variables in high‐dimensional predictive modelling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(3), pages 791-811, July.
    9. Marron, J.S., 2017. "Big Data in context and robustness against heterogeneity," Econometrics and Statistics, Elsevier, vol. 2(C), pages 73-80.
    10. Seungchul Baek & Yen‐Yi Ho & Yanyuan Ma, 2020. "Using sufficient direction factor model to analyze latent activities associated with breast cancer survival," Biometrics, The International Biometric Society, vol. 76(4), pages 1340-1350, December.
    11. Griffin, Maryclare & Hoff, Peter D., 2019. "Lasso ANOVA decompositions for matrix and tensor data," Computational Statistics & Data Analysis, Elsevier, vol. 137(C), pages 181-194.
    12. Yunfeng Li & Jarrett Morrow & Benjamin Raby & Kelan Tantisira & Scott T Weiss & Wei Huang & Weiliang Qiu, 2017. "Detecting disease-associated genomic outcomes using constrained mixture of Bayesian hierarchical models for paired data," PLOS ONE, Public Library of Science, vol. 12(3), pages 1-16, March.
    13. Benjamin A Logsdon & Jason Mezey, 2010. "Gene Expression Network Reconstruction by Convex Feature Selection when Incorporating Genetic Perturbations," PLOS Computational Biology, Public Library of Science, vol. 6(12), pages 1-13, December.
    14. Zhaohui Qin & Ben Li & Karen N. Conneely & Hao Wu & Ming Hu & Deepak Ayyala & Yongseok Park & Victor X. Jin & Fangyuan Zhang & Han Zhang & Li Li & Shili Lin, 2016. "Statistical Challenges in Analyzing Methylation and Long-Range Chromosomal Interaction Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 8(2), pages 284-309, October.
    15. Zemin Zheng & Jinchi Lv & Wei Lin, 2021. "Nonsparse Learning with Latent Variables," Operations Research, INFORMS, vol. 69(1), pages 346-359, January.
    16. Chee Ho H’ng & Shanika L. Amarasinghe & Boya Zhang & Hojin Chang & Xinli Qu & David R. Powell & Alberto Rosello-Diez, 2024. "Compensatory growth and recovery of cartilage cytoarchitecture after transient cell death in fetal mouse limbs," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    17. Mark Reimers, 2010. "Making Informed Choices about Microarray Data Analysis," PLOS Computational Biology, Public Library of Science, vol. 6(5), pages 1-7, May.
    18. Leek Jeffrey T & Storey John D., 2011. "The Joint Null Criterion for Multiple Hypothesis Tests," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-22, June.
    19. Nicoló Fusi & Oliver Stegle & Neil D Lawrence, 2012. "Joint Modelling of Confounding Factors and Prominent Genetic Regulators Provides Increased Accuracy in Genetical Genomics Studies," PLOS Computational Biology, Public Library of Science, vol. 8(1), pages 1-9, January.
    20. Miecznikowski, Jeffrey C. & Gold, David & Shepherd, Lori & Liu, Song, 2011. "Deriving and comparing the distribution for the number of false positives in single step methods to control k-FWER," Statistics & Probability Letters, Elsevier, vol. 81(11), pages 1695-1705, November.
    21. Aline Talhouk & Stefan Kommoss & Robertson Mackenzie & Martin Cheung & Samuel Leung & Derek S Chiu & Steve E Kalloger & David G Huntsman & Stephanie Chen & Maria Intermaggio & Jacek Gronwald & Fong C , 2016. "Single-Patient Molecular Testing with NanoString nCounter Data Using a Reference-Based Strategy for Batch Effect Correction," PLOS ONE, Public Library of Science, vol. 11(4), pages 1-18, April.
