
Semiparametric Bayesian multiple imputation for regression models with missing mixed continuous–discrete covariates

Author

Listed:
  • Ryo Kato

    (Kobe University)

  • Takahiro Hoshino

    (Keio University; RIKEN Center for Advanced Intelligence Project)

Abstract

Missing data are a critical issue in observational and experimental research. For datasets with mixed continuous–discrete variables, multiple imputation by chained equations (MICE) has recently been widely used, although it may yield severely biased estimates. We propose a new semiparametric Bayes multiple imputation approach that can handle both continuous and discrete variables. The approach overcomes a key shortcoming of MICE: its conditional imputation models must satisfy strong conditions (known as compatibility) to guarantee that the resulting estimators are consistent. Our simulation studies show that the coverage probability of the 95% interval calculated using MICE can be less than 1%, while the MSE of the proposed method can be less than one-fiftieth of that of MICE. We applied our method to the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset; the results are consistent with those of previous studies that used panel data other than the ADNI database, whereas existing methods such as MICE yielded inconsistent results.
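
The two simulation metrics named in the abstract, coverage probability of an interval and MSE, can be sketched as follows. This is a minimal illustration only: the function names and the toy numbers are assumptions made for this example and are not taken from the paper's simulation study.

```python
from statistics import mean


def coverage_probability(intervals, true_value):
    """Share of (lower, upper) intervals that contain the true parameter."""
    hits = [lower <= true_value <= upper for lower, upper in intervals]
    return sum(hits) / len(hits)


def mean_squared_error(estimates, true_value):
    """Average squared deviation of the point estimates from the true parameter."""
    return mean((est - true_value) ** 2 for est in estimates)


# Hypothetical results from four simulated datasets (toy values, not from the paper).
true_beta = 1.0
estimates = [0.9, 1.1, 1.4, 1.0]
intervals = [(0.7, 1.1), (0.9, 1.3), (1.2, 1.6), (0.8, 1.2)]

coverage = coverage_probability(intervals, true_beta)
mse = mean_squared_error(estimates, true_beta)
print(f"coverage = {coverage:.2f}, MSE = {mse:.3f}")
```

In a real study each entry would come from one replicated dataset, with the intervals built from the pooled multiple-imputation estimate; a 95% interval that is well calibrated should cover the true value in roughly 95% of replications.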

Suggested Citation

  • Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian multiple imputation for regression models with missing mixed continuous–discrete covariates," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(3), pages 803-825, June.
  • Handle: RePEc:spr:aistmt:v:72:y:2020:i:3:d:10.1007_s10463-019-00710-w
    DOI: 10.1007/s10463-019-00710-w


