Printed from https://ideas.repec.org/a/spr/jclass/v35y2018i1d10.1007_s00357-018-9248-z.html

Two-Stage Metropolis-Hastings for Tall Data

Author

Listed:
  • Richard D. Payne

    (3143 Texas A&M University)

  • Bani K. Mallick

    (3143 Texas A&M University)

Abstract

This paper discusses the challenges presented by tall data problems associated with Bayesian classification (specifically binary classification) and the existing methods to handle them. Current methods include parallelizing the likelihood, subsampling, and consensus Monte Carlo. A new method based on the two-stage Metropolis-Hastings algorithm is also proposed. The purpose of this algorithm is to reduce the computational cost of evaluating the exact likelihood in the tall data situation. In the first stage, a new proposal is tested against a model based on an approximate likelihood. The posterior computation based on the full likelihood is carried out only if the proposal passes this first-stage screening. Furthermore, this method can be incorporated into the consensus Monte Carlo framework. The two-stage method is applied to logistic regression, hierarchical logistic regression, and Bayesian multivariate adaptive regression splines.
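
The two-stage screening described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a one-parameter logistic regression, a symmetric random-walk proposal, and an approximate likelihood built from a fixed random subsample whose log-likelihood is rescaled to the full sample size. The abstract does not say which approximation the paper uses, so that choice is hypothetical. The second-stage ratio is the standard delayed-acceptance correction, which divides out the first-stage ratio so that the chain still targets the full posterior. All function names, the N(0, 10^2) prior, and the tuning values are illustrative.

```python
import math
import random


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_lik(beta, xs, ys):
    """Bernoulli-logit log-likelihood with one slope and no intercept."""
    return sum(log_sigmoid(beta * x if y == 1 else -beta * x)
               for x, y in zip(xs, ys))


def log_prior(beta, sd=10.0):
    """Unnormalised N(0, sd^2) log-prior (an illustrative assumption)."""
    return -0.5 * (beta / sd) ** 2


def simulate_data(n, true_beta, seed):
    """Draw synthetic binary responses from the logistic model."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < math.exp(log_sigmoid(true_beta * x)) else 0
          for x in xs]
    return xs, ys


def two_stage_mh(xs, ys, n_iter=3000, step=0.2, subsample=250, seed=0):
    """Random-walk Metropolis-Hastings with a cheap first-stage screen."""
    rng = random.Random(seed)
    n = len(xs)
    # Hypothetical approximation: fixed subsample, log-likelihood scaled by n/m.
    idx = rng.sample(range(n), min(subsample, n))
    sub_x = [xs[i] for i in idx]
    sub_y = [ys[i] for i in idx]
    scale = n / len(idx)

    def approx_post(b):
        return scale * log_lik(b, sub_x, sub_y) + log_prior(b)

    def full_post(b):
        return log_lik(b, xs, ys) + log_prior(b)

    beta = 0.0
    cur_approx, cur_full = approx_post(beta), full_post(beta)
    draws, full_evals = [], 0
    for _ in range(n_iter):
        prop = beta + rng.gauss(0.0, step)  # symmetric proposal
        prop_approx = approx_post(prop)
        # Stage 1: screen with the approximate posterior only.
        if math.log(1.0 - rng.random()) < prop_approx - cur_approx:
            # Stage 2: evaluate the full posterior and correct for stage 1.
            prop_full = full_post(prop)
            full_evals += 1
            log_a2 = (prop_full - cur_full) - (prop_approx - cur_approx)
            if math.log(1.0 - rng.random()) < log_a2:
                beta, cur_approx, cur_full = prop, prop_approx, prop_full
        draws.append(beta)
    return draws, full_evals


if __name__ == "__main__":
    xs, ys = simulate_data(500, 1.5, seed=1)
    draws, full_evals = two_stage_mh(xs, ys, seed=2)
    kept = draws[1000:]  # discard an arbitrary burn-in
    print("posterior mean estimate:", sum(kept) / len(kept))
    print("full-likelihood evaluations:", full_evals, "of", len(draws))
```

In this sketch the full likelihood is evaluated only for proposals that pass the first stage, so `full_evals` counts how often the expensive computation was actually needed. How much work is saved depends on how closely the approximate posterior tracks the full one.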

Suggested Citation

  • Richard D. Payne & Bani K. Mallick, 2018. "Two-Stage Metropolis-Hastings for Tall Data," Journal of Classification, Springer;The Classification Society, vol. 35(1), pages 29-51, April.
  • Handle: RePEc:spr:jclass:v:35:y:2018:i:1:d:10.1007_s00357-018-9248-z
    DOI: 10.1007/s00357-018-9248-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00357-018-9248-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Shujie Ma & Jeffrey S. Racine & Lijian Yang, 2015. "Spline Regression in the Presence of Categorical Predictors," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 30(5), pages 705-717, August.
    2. Bani K. Mallick & Debashis Ghosh & Malay Ghosh, 2005. "Bayesian classification of tumours by using gene expression data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 219-234, April.

    Citations

    Cited by:

    1. Vasiliy A. Anikin & Yulia P. Lezhnina & Svetlana V. Mareeva & Ekaterina D. Slobodenyuk & Nataliya N. Tikhonovà, 2016. "Income Stratification: Key Approaches and Their Application to Russia," HSE Working papers WP BRP 02/PSP/2016, National Research University Higher School of Economics.
    2. Douglas L. Steinley, 2018. "Editorial," Journal of Classification, Springer;The Classification Society, vol. 35(2), pages 195-197, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2020. "Model uncertainty, nonlinearities and out-of-sample comparison: evidence from international technology diffusion," Working Papers hal-02790523, HAL.
    2. Daniel J. Henderson & Anne-Charlotte Souto & Le Wang, 2020. "Higher-Order Risk–Returns to Education," JRFM, MDPI, vol. 13(11), pages 1-25, October.
    3. Matias Quiroz & Robert Kohn & Mattias Villani & Minh-Ngoc Tran, 2019. "Speeding Up MCMC by Efficient Data Subsampling," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 831-843, April.
    4. Shintaro Yamaguchi, 2013. "Changes in Returns to Task-Specific Skills and Gender Wage Gap," Global COE Hi-Stat Discussion Paper Series gd12-275, Institute of Economic Research, Hitotsubashi University.
    5. Lien, Donald & Hu, Yue & Liu, Long, 2017. "A note on using ratio variables in regression analysis," Economics Letters, Elsevier, vol. 150(C), pages 114-117.
    6. Chakraborty, Sounak, 2009. "Simultaneous cancer classification and gene selection with Bayesian nearest neighbor method: An integrated approach," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1462-1474, February.
    7. Massimiliano Mazzanti & Antonio Musolesi, 2020. "Modeling Green Knowledge Production and Environmental Policies with Semiparametric Panel Data Regression models," SEEDS Working Papers 1420, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Sep 2020.
    8. Gioldasis, Georgios & Musolesi, Antonio & Simioni, Michel, 2023. "Interactive R&D spillovers: An estimation strategy based on forecasting-driven model selection," International Journal of Forecasting, Elsevier, vol. 39(1), pages 144-169.
    9. Luts, Jan & Ormerod, John T., 2014. "Mean field variational Bayesian inference for support vector machine classification," Computational Statistics & Data Analysis, Elsevier, vol. 73(C), pages 163-176.
    10. Nicholas M. Kiefer & Jeffrey S. Racine, 2017. "The smooth colonel and the reverend find common ground," Econometric Reviews, Taylor & Francis Journals, vol. 36(1-3), pages 241-256, March.
    11. Daniel J. Henderson & Anne-Charlotte Souto, 2018. "An Introduction to Nonparametric Regression for Labor Economists," Journal of Labor Research, Springer, vol. 39(4), pages 355-382, December.
    12. Geraldine Henningsen & Arne Henningsen & Christian Henning, 2015. "Transaction costs and social networks in productivity measurement," Empirical Economics, Springer, vol. 48(1), pages 493-515, February.
    13. Christopher F. Parmeter & Jeffrey S. Racine, 2018. "Nonparametric Estimation and Inference for Panel Data Models," Department of Economics Working Papers 2018-02, McMaster University.
    14. Jeffrey S. Racine & Qi Li & Li Zheng, 2018. "Optimal Model Averaging of Mixed-Data Kernel-Weighted Spline Regressions," Department of Economics Working Papers 2018-10, McMaster University.
    15. Clingingsmith, David, 2016. "Negative Emotions, Income, and Welfare: Causal Estimates from the PSID," SocArXiv fae4x, Center for Open Science.
    16. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2018. "Nonparametric estimation of international R&D spillovers," SEEDS Working Papers 0318, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Mar 2018.
    17. Rong Liu & Yichuan Zhao, 2021. "Empirical likelihood inference for generalized additive partially linear models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 569-585, September.
    18. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2019. "Nonparametric estimation of R&D international spillovers," Post-Print hal-02789474, HAL.
    19. Boente, Graciela & Martínez, Alejandra Mercedes, 2023. "A robust spline approach in partially linear additive models," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    20. Antonio Musolesi & Michel Simioni & Georgios Gioldasis, 2018. "Nonparametric estimation of international R&D spillovers," Working Papers 2018037, University of Ferrara, Department of Economics.
