Printed from https://ideas.repec.org/a/spr/jclass/v35y2018i1d10.1007_s00357-018-9248-z.html

Two-Stage Metropolis-Hastings for Tall Data

Author

Listed:
  • Richard D. Payne

    (Texas A&M University)

  • Bani K. Mallick

    (Texas A&M University)

Abstract

This paper discusses the challenges that tall data problems pose for Bayesian classification (specifically binary classification) and the existing methods for handling them. Current methods include parallelizing the likelihood, subsampling, and consensus Monte Carlo. A new method based on the two-stage Metropolis-Hastings algorithm is also proposed. The algorithm aims to reduce the cost of computing the exact likelihood in the tall data setting. In the first stage, a new proposal is screened using a model based on an approximate likelihood. The posterior computation based on the full likelihood is carried out only if the proposal passes the first-stage screening. Furthermore, this method can be incorporated into the consensus Monte Carlo framework. The two-stage method is applied to logistic regression, hierarchical logistic regression, and Bayesian multivariate adaptive regression splines.
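
The two-stage screening described in the abstract can be illustrated with a minimal Python sketch for logistic regression. This is not the authors' implementation. It assumes a symmetric random-walk proposal, a Gaussian prior, and an approximate likelihood built from a fixed random subsample rescaled by n/m. The second-stage ratio is the standard delayed-acceptance correction, which keeps the exact posterior as the target. The names `two_stage_mh`, `log_lik`, `log_prior`, `m`, and `step` are hypothetical.

```python
import numpy as np

def log_lik(beta, X, y):
    """Bernoulli log-likelihood of a logistic regression."""
    eta = X @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def log_prior(beta, scale=10.0):
    """Independent N(0, scale^2) prior, up to an additive constant (assumed)."""
    return float(-0.5 * np.sum((beta / scale) ** 2))

def two_stage_mh(X, y, n_iter=2000, m=500, step=0.05, seed=0):
    """Random-walk MH with a cheap first-stage screen (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Assumption: the approximation is a fixed subsample of m rows,
    # with its log-likelihood rescaled by n / m.
    idx = rng.choice(n, size=m, replace=False)
    Xs, ys = X[idx], y[idx]

    def log_approx(b):
        return (n / m) * log_lik(b, Xs, ys) + log_prior(b)

    def log_full(b):
        return log_lik(b, X, y) + log_prior(b)

    beta = np.zeros(p)
    la, lf = log_approx(beta), log_full(beta)
    draws = np.empty((n_iter, p))
    full_evals = 0  # number of times the full likelihood was needed
    for t in range(n_iter):
        prop = beta + step * rng.standard_normal(p)  # symmetric proposal
        la_prop = log_approx(prop)
        # Stage 1: screen the proposal with the approximate posterior only.
        if np.log(rng.uniform()) < la_prop - la:
            lf_prop = log_full(prop)
            full_evals += 1
            # Stage 2: exact posterior ratio divided by the approximate ratio.
            if np.log(rng.uniform()) < (lf_prop - lf) - (la_prop - la):
                beta, la, lf = prop, la_prop, lf_prop
        draws[t] = beta
    return draws, full_evals
```

Calling `two_stage_mh(X, y)` returns the chain and the number of full-likelihood evaluations. Proposals rejected in the first stage never touch the full data set, so that count is at most `n_iter`.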

Suggested Citation

  • Richard D. Payne & Bani K. Mallick, 2018. "Two-Stage Metropolis-Hastings for Tall Data," Journal of Classification, Springer;The Classification Society, vol. 35(1), pages 29-51, April.
  • Handle: RePEc:spr:jclass:v:35:y:2018:i:1:d:10.1007_s00357-018-9248-z
    DOI: 10.1007/s00357-018-9248-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00357-018-9248-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    References listed on IDEAS

    1. Shujie Ma & Jeffrey S. Racine & Lijian Yang, 2015. "Spline Regression in the Presence of Categorical Predictors," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 30(5), pages 705-717, August.
    2. Bani K. Mallick & Debashis Ghosh & Malay Ghosh, 2005. "Bayesian classification of tumours by using gene expression data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 219-234, April.

    Citations



    Cited by:

    1. Vasiliy A. Anikin & Yulia P. Lezhnina & Svetlana V. Mareeva & Ekaterina D. Slobodenyuk & Nataliya N. Tikhonova, 2016. "Income Stratification: Key Approaches and Their Application to Russia," HSE Working papers WP BRP 02/PSP/2016, National Research University Higher School of Economics.
    2. Douglas L. Steinley, 2018. "Editorial," Journal of Classification, Springer;The Classification Society, vol. 35(2), pages 195-197, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matias Quiroz & Robert Kohn & Mattias Villani & Minh-Ngoc Tran, 2019. "Speeding Up MCMC by Efficient Data Subsampling," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 831-843, April.
    2. Shintaro Yamaguchi, 2013. "Changes in Returns to Task-Specific Skills and Gender Wage Gap," Global COE Hi-Stat Discussion Paper Series gd12-275, Institute of Economic Research, Hitotsubashi University.
    3. Nicholas M. Kiefer & Jeffrey S. Racine, 2017. "The smooth colonel and the reverend find common ground," Econometric Reviews, Taylor & Francis Journals, vol. 36(1-3), pages 241-256, March.
    4. Daniel J. Henderson & Anne-Charlotte Souto, 2018. "An Introduction to Nonparametric Regression for Labor Economists," Journal of Labor Research, Springer, vol. 39(4), pages 355-382, December.
    5. Geraldine Henningsen & Arne Henningsen & Christian Henning, 2015. "Transaction costs and social networks in productivity measurement," Empirical Economics, Springer, vol. 48(1), pages 493-515, February.
    6. Christopher F. Parmeter & Jeffrey S. Racine, 2018. "Nonparametric Estimation and Inference for Panel Data Models," Department of Economics Working Papers 2018-02, McMaster University.
    7. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2018. "Nonparametric estimation of international R&D spillovers," SEEDS Working Papers 0318, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Mar 2018.
    8. Clingingsmith, David, 2017. "Negative Emotions, Income, and Welfare: Causal Estimates from the PSID," SocArXiv q2mxt, Center for Open Science.
    9. Jeffrey S. Racine, 2016. "A Correction to 'Generalized Nonparametric Smoothing with Mixed Discrete and Continuous Data' by Li, Simar & Zelenyuk (2014, CSDA)," Department of Economics Working Papers 2016-01, McMaster University.
    10. Paudel, Krishna P. & Lin, C.-Y. Cynthia & Pandit, Mahesh, 2014. "Environmental Kuznets Curve for Water Quality Parameters at Global Level," 2014 Annual Meeting, February 1-4, 2014, Dallas, Texas 162618, Southern Agricultural Economics Association.
    11. Dixit, Anand & Roy, Vivekananda, 2021. "Posterior impropriety of some sparse Bayesian learning models," Statistics & Probability Letters, Elsevier, vol. 171(C).
    12. Massimiliano Mazzanti & Antonio Musolesi, 2020. "A Semiparametric Analysis of Green Inventions and Environmental Policies," SEEDS Working Papers 0920, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Jun 2020.
    13. Daiqiang Zhang, 2021. "Testing Passive Versus Symmetric Beliefs In Contracting With Externalities," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 62(2), pages 723-767, May.
    14. Shujie Ma & Jeffrey S. Racine & Aman Ullah, 2015. "Nonparametric Regression-Spline Random Effects Models," Department of Economics Working Papers 2015-10, McMaster University.
    15. Jean-Thomas Bernard & Michael Gavin & Lynda Khalaf & Marcel Voia, 2015. "Environmental Kuznets Curve: Tipping Points, Uncertainty and Weak Identification," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 60(2), pages 285-315, February.
    16. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2020. "Model uncertainty, nonlinearities and out-of-sample comparison: evidence from international technology diffusion," Working Papers hal-02790523, HAL.
    17. Daniel J. Henderson & Anne-Charlotte Souto & Le Wang, 2020. "Higher-Order Risk–Returns to Education," JRFM, MDPI, vol. 13(11), pages 1-25, October.
    18. Lien, Donald & Hu, Yue & Liu, Long, 2017. "A note on using ratio variables in regression analysis," Economics Letters, Elsevier, vol. 150(C), pages 114-117.
    19. Chakraborty, Sounak, 2009. "Simultaneous cancer classification and gene selection with Bayesian nearest neighbor method: An integrated approach," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1462-1474, February.
    20. Massimiliano Mazzanti & Antonio Musolesi, 2020. "Modeling Green Knowledge Production and Environmental Policies with Semiparametric Panel Data Regression models," SEEDS Working Papers 1420, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Sep 2020.
