IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0049445.html

Finite Adaptation and Multistep Moves in the Metropolis-Hastings Algorithm for Variable Selection in Genome-Wide Association Analysis

Authors

Listed:
  • Tomi Peltola
  • Pekka Marttinen
  • Aki Vehtari

Abstract

High-dimensional datasets with large amounts of redundant information are nowadays available for hypothesis-free exploration of scientific questions. A particular case is genome-wide association analysis, where variations in the genome are searched for effects on disease or other traits. Bayesian variable selection has been demonstrated as a possible analysis approach, which can account for the multifactorial nature of the genetic effects in a linear regression model. Yet, the computation presents a challenge and application to large-scale data is not routine. Here, we study aspects of the computation using the Metropolis-Hastings algorithm for the variable selection: finite adaptation of the proposal distributions, multistep moves for changing the inclusion state of multiple variables in a single proposal, and multistep move size adaptation. We also experiment with a delayed rejection step for the multistep moves. Results on simulated and real data show an increase in sampling efficiency. We also demonstrate that, with application-specific proposals, the approach can overcome a specific mixing problem in real data with 3822 individuals and 1,051,811 single nucleotide polymorphisms and uncover a variant pair with a synergistic effect on the studied trait. Moreover, we illustrate multimodality in the real dataset related to a restrictive prior distribution on the genetic effect sizes and advocate a more flexible alternative.
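
The ideas named in the abstract (multistep moves, finite adaptation of the proposal, adaptation of the move size) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a toy linear model with known noise variance, independent Gaussian priors on the included effects, a Bernoulli prior on inclusion, and a naive acceptance-count rule for adapting the move size. All function names, parameters, and default values are hypothetical, and the delayed rejection step is omitted.

```python
import numpy as np

def log_posterior(gamma, X, y, tau2=1.0, sigma2=1.0, incl_prob=0.1):
    """Unnormalised log posterior of a boolean inclusion vector (assumed toy model)."""
    n = len(y)
    Xg = X[:, gamma]
    # Effects integrated out: y ~ N(0, sigma2 * I + tau2 * Xg Xg^T).
    cov = sigma2 * np.eye(n) + tau2 * Xg @ Xg.T
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    log_lik = -0.5 * (logdet + quad + n * np.log(2 * np.pi))
    k = gamma.sum()
    log_prior = k * np.log(incl_prob) + (len(gamma) - k) * np.log1p(-incl_prob)
    return log_lik + log_prior

def sample_inclusion(X, y, n_iter=2000, n_adapt=500, max_step=3, seed=0):
    """Metropolis-Hastings over inclusion vectors with multistep flip moves.

    The move-size distribution is adapted only during the first n_adapt
    iterations (finite adaptation) and then frozen; those iterations are
    discarded. Returns estimated posterior inclusion probabilities.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma = np.zeros(p, dtype=bool)          # start from the empty model
    logp = log_posterior(gamma, X, y)
    step_weights = np.ones(max_step)         # pseudo-counts for move sizes 1..max_step
    incl_sum = np.zeros(p)
    for it in range(n_iter):
        probs = step_weights / step_weights.sum()
        k = rng.choice(max_step, p=probs) + 1            # number of variables to flip
        flip = rng.choice(p, size=k, replace=False)      # which variables to flip
        prop = gamma.copy()
        prop[flip] = ~prop[flip]
        logp_prop = log_posterior(prop, X, y)
        # Uniform choice of k indices makes the proposal symmetric, so the
        # acceptance ratio reduces to the posterior ratio.
        accepted = np.log(rng.random()) < logp_prop - logp
        if accepted:
            gamma, logp = prop, logp_prop
        if it < n_adapt:
            step_weights[k - 1] += accepted  # naive rule: favour sizes that get accepted
        else:
            incl_sum += gamma                # collect samples only after adaptation stops
    return incl_sum / (n_iter - n_adapt)
```

Calling `sample_inclusion(X, y)` on an `n × p` design matrix `X` and response vector `y` returns one estimated inclusion probability per column. Freezing `step_weights` after `n_adapt` iterations means the retained samples come from an ordinary, non-adaptive Metropolis-Hastings chain; the acceptance-count rule is only one simple way to tune the move size and is chosen here for brevity.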

Suggested Citation

  • Tomi Peltola & Pekka Marttinen & Aki Vehtari, 2012. "Finite Adaptation and Multistep Moves in the Metropolis-Hastings Algorithm for Variable Selection in Genome-Wide Association Analysis," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-11, November.
  • Handle: RePEc:plo:pone00:0049445
    DOI: 10.1371/journal.pone.0049445

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0049445
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0049445&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Clive J Hoggart & John C Whittaker & Maria De Iorio & David J Balding, 2008. "Simultaneous Analysis of All SNPs in Genome-Wide and Re-Sequencing Association Studies," PLOS Genetics, Public Library of Science, vol. 4(7), pages 1-8, July.
    2. Antonietta Mira, 2001. "On Metropolis-Hastings algorithms with delayed rejection," Metron - International Journal of Statistics, Dipartimento di Statistica, Probabilità e Statistiche Applicate - University of Rome, vol. 0(3-4), pages 231-241.
    3. David J. Nott & Robert Kohn, 2005. "Adaptive sampling for Bayesian variable selection," Biometrika, Biometrika Trust, vol. 92(4), pages 747-763, December.

    Citations

    Cited by:

    1. Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
    2. Michalis K. Titsias & Christopher Yau, 2017. "The Hamming Ball Sampler," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1598-1611, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Frommlet, Florian & Ruhaltinger, Felix & Twaróg, Piotr & Bogdan, Małgorzata, 2012. "Modified versions of Bayesian Information Criterion for genome-wide association studies," Computational Statistics & Data Analysis, Elsevier, vol. 56(5), pages 1038-1051.
    2. Li, Feng & Kang, Yanfei, 2018. "Improving forecasting performance using covariate-dependent copula models," International Journal of Forecasting, Elsevier, vol. 34(3), pages 456-476.
    3. Ahmed Ismaïl & Hartikainen Anna-Liisa & Järvelin Marjo-Riitta & Richardson Sylvia, 2011. "False Discovery Rate Estimation for Stability Selection: Application to Genome-Wide Association Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-20, November.
    4. Szefer Elena & Graham Jinko & Lu Donghuan & Beg Mirza Faisal & Nathoo Farouk, 2017. "Multivariate association between single-nucleotide polymorphisms in Alzgene linkage regions and structural changes in the brain: discovery, refinement and validation," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(5-6), pages 349-365, December.
    5. Ley, Eduardo & Steel, Mark F.J., 2012. "Mixtures of g-priors for Bayesian model averaging with economic applications," Journal of Econometrics, Elsevier, vol. 171(2), pages 251-266.
    6. Ley, Eduardo & Steel, Mark F. J., 2007. "On the effect of prior assumptions in Bayesian model averaging with applications to growth regression," Policy Research Working Paper Series 4238, The World Bank.
    7. Pasanisi, Alberto & Fu, Shuai & Bousquet, Nicolas, 2012. "Estimating discrete Markov models from various incomplete data schemes," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2609-2625.
    8. Gael M. Martin & David T. Frazier & Christian P. Robert, 2020. "Computing Bayes: Bayesian Computation from 1763 to the 21st Century," Monash Econometrics and Business Statistics Working Papers 14/20, Monash University, Department of Econometrics and Business Statistics.
    9. Manabu Asai & Michael McAleer, 2022. "Bayesian Analysis of Realized Matrix-Exponential GARCH Models," Computational Economics, Springer;Society for Computational Economics, vol. 59(1), pages 103-123, January.
    10. Villani, Mattias & Kohn, Robert & Giordani, Paolo, 2009. "Regression density estimation using smooth adaptive Gaussian mixtures," Journal of Econometrics, Elsevier, vol. 153(2), pages 155-173, December.
    11. Gael M. Martin & David T. Frazier & Ruben Loaiza-Maya & Florian Huber & Gary Koop & John Maheu & Didier Nibbering & Anastasios Panagiotelis, 2023. "Bayesian Forecasting in the 21st Century: A Modern Review," Monash Econometrics and Business Statistics Working Papers 1/23, Monash University, Department of Econometrics and Business Statistics.
    12. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    13. Lee Anthony & Caron Francois & Doucet Arnaud & Holmes Chris, 2012. "Bayesian Sparsity-Path-Analysis of Genetic Association Signal using Generalized t Priors," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(2), pages 1-31, January.
    14. Ishihara, Tsunehiro & Omori, Yasuhiro & Asai, Manabu, 2016. "Matrix exponential stochastic volatility with cross leverage," Computational Statistics & Data Analysis, Elsevier, vol. 100(C), pages 331-350.
    15. Zhang, Xibin & King, Maxwell L. & Shang, Han Lin, 2014. "A sampling algorithm for bandwidth estimation in a nonparametric regression model with a flexible error density," Computational Statistics & Data Analysis, Elsevier, vol. 78(C), pages 218-234.
    16. Hai-Yan Lü & Xiao-Fen Liu & Shi-Ping Wei & Yuan-Ming Zhang, 2011. "Epistatic Association Mapping in Homozygous Crop Cultivars," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-10, March.
    17. Villani, Mattias & Kohn, Robert & Nott, David J., 2012. "Generalized smooth finite mixtures," Journal of Econometrics, Elsevier, vol. 171(2), pages 121-133.
    18. Luca Martino & Jesse Read, 2013. "On the flexibility of the design of multiple try Metropolis schemes," Computational Statistics, Springer, vol. 28(6), pages 2797-2823, December.
    19. Claude Renaux & Laura Buzdugan & Markus Kalisch & Peter Bühlmann, 2020. "Hierarchical inference for genome-wide association studies: a view on methodology with software," Computational Statistics, Springer, vol. 35(1), pages 1-40, March.
    20. Giordani, Paolo & Kohn, Robert, 2008. "Efficient Bayesian Inference for Multiple Change-Point and Mixture Innovation Models," Journal of Business & Economic Statistics, American Statistical Association, vol. 26, pages 66-77, January.
