Printed from https://ideas.repec.org/p/kob/dpaper/dp2018-15.html

Semiparametric Bayes Multiple Imputation for Regression Models with Missing Mixed Continuous-Discrete Covariates

Author

Listed:
  • Ryo Kato

    (Research Institute for Economics & Business Administration (RIEB), Kobe University, Japan)

  • Takahiro Hoshino

    (Department of Economics, Keio University, Japan and RIKEN Center for Advanced Intelligence Project, Japan)

Abstract

Missing data are a critical issue in observational and experimental research because they cause loss of information and biased results. For datasets with mixed continuous and discrete variables, multiple imputation by chained equations (MICE) has recently become widely used across many fields of study, although MICE may yield severely biased estimates. We propose a new semiparametric Bayes multiple imputation approach that can handle both continuous and discrete variables. This approach overcomes a key shortcoming of MICE: its conditional models must satisfy strong conditions (known as compatibility) to guarantee that the resulting estimators are consistent. Our exhaustive simulation studies show that the coverage probability of the 95% interval calculated using MICE can be less than 1%, while the MSE of the proposed method can be less than one-fiftieth of that obtained with MICE. We also applied our method to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset; the results are consistent with those of previous studies that used panel data other than the ADNI database, whereas existing methods such as MICE produced entirely inconsistent results.
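
As background for the multiple-imputation setting described in the abstract, the following is a minimal sketch of Rubin's rules, the standard way to pool a point estimate and its variance across m imputed datasets. It is not the semiparametric Bayes imputation model proposed in the paper, and it does not reproduce any of the paper's results. The function name `pool_rubin` and its inputs are hypothetical and chosen only for illustration.

```python
from math import inf
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimates of one parameter, one per imputed dataset.
    variances: the matching within-imputation variance estimates.
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance of the pooled estimate

    # Rubin's (1987) degrees of freedom for the t reference distribution;
    # infinite when the imputations agree exactly (b == 0).
    df = inf if b == 0 else (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    return q_bar, t, df


# Hypothetical inputs: three imputed datasets, one regression coefficient.
pooled, total_var, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

An interval estimate would then take the pooled estimate plus or minus a t quantile with `df` degrees of freedom times the square root of `total_var`. Coverage figures such as those quoted in the abstract refer to how often intervals of this kind contain the true parameter across simulated datasets.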

Suggested Citation

  • Ryo Kato & Takahiro Hoshino, 2018. "Semiparametric Bayes Multiple Imputation for Regression Models with Missing Mixed Continuous-Discrete Covariates," Discussion Paper Series DP2018-15, Research Institute for Economics & Business Administration, Kobe University.
  • Handle: RePEc:kob:dpaper:dp2018-15

    Download full text from publisher

    File URL: https://www.rieb.kobe-u.ac.jp/academic/ra/dp/English/DP2018-15.pdf
    File Function: First version, 2018
    Download Restriction: no

    References listed on IDEAS

    1. Joseph G. Ibrahim & Ming-Hui Chen & Stuart R. Lipsitz & Amy H. Herring, 2005. "Missing-Data Methods for Generalized Linear Models: A Comparative Review," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 332-346, March.
    2. Jared S. Murray & Jerome P. Reiter, 2016. "Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models With Local Dependence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1466-1479, October.
    3. Kunihama, T. & Herring, A.H. & Halpern, C.T. & Dunson, D.B., 2016. "Nonparametric Bayes modeling with sample survey weights," Statistics & Probability Letters, Elsevier, vol. 113(C), pages 41-48.
    4. Takahiro Hoshino, 2013. "Semiparametric Bayesian Estimation for Marginal Parametric Potential Outcome Modeling: Application to Causal Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(504), pages 1189-1204, December.
    5. Keisuke Hirano, 2002. "Semiparametric Bayesian Inference in Autoregressive Panel Data Models," Econometrica, Econometric Society, vol. 70(2), pages 781-799, March.
    6. Ming-Hui Chen & Joseph G. Ibrahim & Qi-Man Shao, 2006. "Posterior propriety and computation for the Cox regression model with applications to missing covariates," Biometrika, Biometrika Trust, vol. 93(4), pages 791-807, December.
    7. Stephen G. Walker & Paul Damien & Purushottam W. Laud & Adrian F. M. Smith, 1999. "Bayesian Nonparametric Inference for Random Distributions and Related Functions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(3), pages 485-527.
    8. David B. Dunson & Natesh Pillai & Ju‐Hyun Park, 2007. "Bayesian density regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 69(2), pages 163-183, April.
    9. Abel Rodríguez & David B. Dunson & Alan E. Gelfand, 2009. "Bayesian nonparametric functional data analysis through density estimation," Biometrika, Biometrika Trust, vol. 96(1), pages 149-162.
    10. Chung, Yeonseung & Dunson, David B., 2009. "Nonparametric Bayes Conditional Distribution Modeling With Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1646-1660.
    11. Royston, Patrick & White, Ian R., 2011. "Multiple Imputation by Chained Equations (MICE): Implementation in Stata," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i04).
    12. Chib, Siddhartha, 2007. "Analysis of treatment response data without the joint distribution of potential outcomes," Journal of Econometrics, Elsevier, vol. 140(2), pages 401-412, October.
    13. Pati, Debdeep & Dunson, David B. & Tokdar, Surya T., 2013. "Posterior consistency in conditional distribution estimation," Journal of Multivariate Analysis, Elsevier, vol. 116(C), pages 456-472.
    14. Jingchen Liu & Andrew Gelman & Jennifer Hill & Yu-Sung Su & Jonathan Kropko, 2014. "On the stationary distribution of iterative imputations," Biometrika, Biometrika Trust, vol. 101(1), pages 155-173.
    15. Kim, Jung Seek & Ratchford, Brian T., 2013. "A Bayesian multivariate probit for ordinal data with semiparametric random-effects," Computational Statistics & Data Analysis, Elsevier, vol. 64(C), pages 192-208.
    16. Weining Shen & Surya T. Tokdar & Subhashis Ghosal, 2013. "Adaptive Bayesian multivariate density estimation with Dirichlet mixtures," Biometrika, Biometrika Trust, vol. 100(3), pages 623-640.
    17. Jing Zhou & Amy H. Herring & Anirban Bhattacharya & Andrew F. Olshan & David B. Dunson, 2016. "Nonparametric Bayes modeling for case control studies with many predictors," Biometrics, The International Biometric Society, vol. 72(1), pages 184-192, March.

    Citations



    Cited by:

    1. Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian Instrumental Variables Estimation for Nonignorable Missing Instruments," Discussion Paper Series DP2020-06, Research Institute for Economics & Business Administration, Kobe University.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian multiple imputation for regression models with missing mixed continuous–discrete covariates," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(3), pages 803-825, June.
    2. Ryo Kato & Takahiro Hoshino, 2018. "Semiparametric Bayes Instrumental Variable Estimation with Many Weak Instruments," Discussion Paper Series DP2018-14, Research Institute for Economics & Business Administration, Kobe University.
    3. Griffin, J.E. & Steel, M.F.J., 2011. "Stick-breaking autoregressive processes," Journal of Econometrics, Elsevier, vol. 162(2), pages 383-396, June.
    4. Igari, Ryosuke & Hoshino, Takahiro, 2018. "A Bayesian data combination approach for repeated durations under unobserved missing indicators: Application to interpurchase-timing in marketing," Computational Statistics & Data Analysis, Elsevier, vol. 126(C), pages 150-166.
    5. Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian Instrumental Variables Estimation for Nonignorable Missing Instruments," Discussion Paper Series DP2020-06, Research Institute for Economics & Business Administration, Kobe University.
    6. Dandan Xu & Michael J. Daniels & Almut G. Winterstein, 2018. "A Bayesian nonparametric approach to causal inference on quantiles," Biometrics, The International Biometric Society, vol. 74(3), pages 986-996, September.
    7. Debdeep Pati & David Dunson, 2014. "Bayesian nonparametric regression with varying residual density," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(1), pages 1-31, February.
    8. Boyuan Zhang, 2022. "Incorporating Prior Knowledge of Latent Group Structure in Panel Data Models," Papers 2211.16714, arXiv.org, revised Oct 2023.
    9. Pelenis, Justinas, 2014. "Bayesian regression with heteroscedastic error density and parametric mean function," Journal of Econometrics, Elsevier, vol. 178(P3), pages 624-638.
    10. Laura Liu, 2017. "Density Forecasts in Panel Models: A semiparametric Bayesian Perspective," PIER Working Paper Archive 17-006, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania, revised 28 Apr 2017.
    11. Federico Bassetti & Roberto Casarin & Francesco Ravazzolo, 2018. "Bayesian Nonparametric Calibration and Combination of Predictive Distributions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 675-685, April.
    12. Griffin, J. E. & Steel, M. F. J., 2004. "Semiparametric Bayesian inference for stochastic frontier models," Journal of Econometrics, Elsevier, vol. 123(1), pages 121-152, November.
    13. Antonio Canale & Bruno Scarpa, 2016. "Bayesian nonparametric location–scale–shape mixtures," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(1), pages 113-130, March.
    14. Emine Boz & Gita Gopinath & Mikkel Plagborg-Møller, 2017. "Global Trade and the Dollar," NBER Working Papers 23988, National Bureau of Economic Research, Inc.
    15. Barrientos, Andrés F. & Canale, Antonio, 2021. "A Bayesian goodness-of-fit test for regression," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
    16. Humera Razzak & Christian Heumann, 2019. "Hybrid Multiple Imputation In A Large Scale Complex Survey," Statistics in Transition New Series, Polish Statistical Association, vol. 20(4), pages 33-58, December.
    17. Takahiro Hoshino & Ryosuke Igari, 2017. "Quasi-Bayesian Inference for Latent Variable Models with External Information: Application to generalized linear mixed models for biased data," Keio-IES Discussion Paper Series 2017-014, Institute for Economics Studies, Keio University.
    18. Razzak Humera & Heumann Christian, 2019. "Hybrid Multiple Imputation In A Large Scale Complex Survey," Statistics in Transition New Series, Polish Statistical Association, vol. 20(4), pages 33-58, December.
    19. Huang, Yifan & Meng, Shengwang, 2020. "A Bayesian nonparametric model and its application in insurance loss prediction," Insurance: Mathematics and Economics, Elsevier, vol. 93(C), pages 84-94.
    20. Jing Zhou & Anirban Bhattacharya & Amy H. Herring & David B. Dunson, 2015. "Bayesian Factorizations of Big Sparse Tensors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1562-1576, December.

    More about this item

    Keywords

    Full conditional specification; Missing data; Multiple imputation; Probit stick-breaking process mixture; Semiparametric Bayes model
