Diagnostic Test for Realized Missingness in Mixed-type Data

Authors

  • Ruizhe Chen (Johns Hopkins University)
  • Yu-Che Chung (Takeda Pharmaceuticals)
  • Sanjib Basu (University of Illinois Chicago)
  • Qian Shi (Mayo Clinic)

Abstract

A frequent concern in analyzing incomplete multivariate measurements on mixed categorical and quantitative scales is whether missing completely at random (MCAR) is an appropriate model. Realized MCAR refers to constancy of the conditional probability at the realized missing-data patterns, and it differs from always MCAR. We develop a scalable approach for diagnosing realized MCAR in mixed-type data, for which existing methods are lacking. We establish that the null framework may hold under the broader condition of observed at random (OAR) when the components are independent; as a result, the method cannot detect departures in the direction of OAR under independence, but it may do so under dependence. We demonstrate that the proposed method is easy to implement and scalable. In the special case of non-mixed-type data, existing methods pose computational difficulties, whereas the proposed approach performs better. The proposed approach is applied to incomplete mixed-type data from the ARCAD metastatic colorectal cancer database.
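
For orientation, the distinction between realized and always MCAR can be written out roughly as follows. This is a sketch in the spirit of the definitions discussed by Mealli and Rubin (2015), not a formula taken from the article itself. The symbols are notation assumed here: X for the complete data, M for the missingness indicators, \tilde m for the realized missingness pattern, and \phi for the parameter of the missingness mechanism.

```latex
% Realized MCAR: constancy is required only at the realized pattern \tilde m
p(M = \tilde m \mid X = x, \phi) = p(M = \tilde m \mid X = x^{*}, \phi)
\quad \text{for all } x,\, x^{*},\, \phi .

% Always MCAR: the same constancy is required for every possible pattern m
p(M = m \mid X = x, \phi) = p(M = m \mid X = x^{*}, \phi)
\quad \text{for all } m,\, x,\, x^{*},\, \phi .
```

Under this reading, always MCAR implies realized MCAR, but the converse need not hold, since the second condition must hold for every pattern m and the first only for the pattern actually observed.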

Suggested Citation

  • Ruizhe Chen & Yu-Che Chung & Sanjib Basu & Qian Shi, 2024. "Diagnostic Test for Realized Missingness in Mixed-type Data," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 86(1), pages 109-138, May.
  • Handle: RePEc:spr:sankhb:v:86:y:2024:i:1:d:10.1007_s13571-023-00317-5
    DOI: 10.1007/s13571-023-00317-5

    References listed on IDEAS

    1. Mortaza Jamshidian & Siavash Jalal, 2010. "Tests of Homoscedasticity, Normality, and Missing Completely at Random for Incomplete Multivariate Data," Psychometrika, Springer;The Psychometric Society, vol. 75(4), pages 649-674, December.
    2. Fabrizia Mealli & Donald B. Rubin, 2015. "Clarifying missing at random and related definitions, and implications when coupled with exchangeability," Biometrika, Biometrika Trust, vol. 102(4), pages 995-1000.
    3. Jun Li & Yao Yu, 2015. "A Nonparametric Test of Missing Completely at Random for Incomplete Multivariate Data," Psychometrika, Springer;The Psychometric Society, vol. 80(3), pages 707-726, September.
    4. Iavor I Bojinov & Natesh S Pillai & Donald B Rubin, 2020. "Diagnosing missing always at random in multivariate data," Biometrika, Biometrika Trust, vol. 107(1), pages 246-253.
    5. Kevin Kim & Peter Bentler, 2002. "Tests of homogeneity of means and covariance matrices for multivariate incomplete data," Psychometrika, Springer;The Psychometric Society, vol. 67(4), pages 609-623, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ke-Hai Yuan & Mortaza Jamshidian & Yutaka Kano, 2018. "Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 425-442, June.
    2. Hairu Wang & Zhiping Lu & Yukun Liu, 2023. "Score test for missing at random or not under logistic missingness models," Biometrics, The International Biometric Society, vol. 79(2), pages 1268-1279, June.
    3. Nobumichi Shutoh & Takahiro Nishiyama & Masashi Hyodo, 2017. "Bartlett correction to the likelihood ratio test for MCAR with two-step monotone sample," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 71(3), pages 184-199, August.
    4. Chassan, Malika & Concordet, Didier, 2023. "How to test the missing data mechanism in a hidden Markov model," Computational Statistics & Data Analysis, Elsevier, vol. 182(C).
    5. Jun Li & Yao Yu, 2015. "A Nonparametric Test of Missing Completely at Random for Incomplete Multivariate Data," Psychometrika, Springer;The Psychometric Society, vol. 80(3), pages 707-726, September.
    6. Jamshidian, Mortaza & Jalal, Siavash & Jansen, Camden, 2014. "MissMech: An R Package for Testing Homoscedasticity, Multivariate Normality, and Missing Completely at Random (MCAR)," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 56(i06).
    7. Ali, Saif & Arora, Gaurav, 2021. "Well-level Missingness Mechanisms in Administrative Groundwater Monitoring Data for Uttar Pradesh (UP), India, 2009-2018," 2021 Annual Meeting, August 1-3, Austin, Texas 314038, Agricultural and Applied Economics Association.
    8. repec:osf:osfxxx:f9jvz_v1 is not listed on IDEAS
    9. Fei Wang & Yuhao Deng, 2023. "Non-Asymptotic Bounds of AIPW Estimators for Means with Missingness at Random," Mathematics, MDPI, vol. 11(4), pages 1-14, February.
    10. Festus O. Amadu & Daniel C. Miller, 2024. "Food security effects of forest sector participation in rural Liberia," Food Security: The Science, Sociology and Economics of Food Production and Access to Food, Springer;The International Society for Plant Pathology, vol. 16(5), pages 1099-1124, October.
    11. Marco Doretti & Sara Geneletti & Elena Stanghellini, 2018. "Missing Data: A Unified Taxonomy Guided by Conditional Independence," International Statistical Review, International Statistical Institute, vol. 86(2), pages 189-204, August.
    12. Francesco Bartolucci & Donata Favaro & Fulvia Pennoni & Dario Sciulli, 2024. "An Analysis of the Effect of Streaming on Civic Participation Through a Causal Hidden Markov Model," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 172(1), pages 163-190, March.
    13. Wei Liu & Zhiwei Zhang & Lei Nie & Guoxing Soon, 2017. "A Case Study in Personalized Medicine: Rilpivirine Versus Efavirenz for Treatment-Naive HIV Patients," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1381-1392, October.
    14. Boris Forthmann & Mark A. Runco, 2020. "An Empirical Test of the Inter-Relationships between Various Bibliometric Creative Scholarship Indicators," Publications, MDPI, vol. 8(2), pages 1-16, June.
    15. Shen‐Ming Lee & Wen‐Han Hwang & Jean de Dieu Tapsoba, 2016. "Estimation in closed capture–recapture models when covariates are missing at random," Biometrics, The International Biometric Society, vol. 72(4), pages 1294-1304, December.
    16. Florian M. Hollenbach & Iavor Bojinov & Shahryar Minhas & Nils W. Metternich & Michael D. Ward & Alexander Volfovsky, 2021. "Multiple Imputation Using Gaussian Copulas," Sociological Methods & Research, vol. 50(3), pages 1259-1283, August.
    17. Ziyang Lyu, 2024. "Analysis of estimating the Bayes rule for Gaussian mixture models with a specified missing-data mechanism," Computational Statistics, Springer, vol. 39(7), pages 3727-3751, December.
    18. Aidan G. O’Keeffe & Daniel M. Farewell & Brian D. M. Tom & Vernon T. Farewell, 2016. "Multiple Imputation of Missing Composite Outcomes in Longitudinal Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 8(2), pages 310-332, October.
    19. D. M. Farewell & C. Huang & V. Didelez, 2017. "Ignorability for general longitudinal data," Biometrika, Biometrika Trust, vol. 104(2), pages 317-326.
    20. Haiyan Yu & Bing Han & Nicholas Rios & Jianbin Chen, 2024. "Missing Data Imputation in Balanced Construction for Incomplete Block Designs," Mathematics, MDPI, vol. 12(21), pages 1-22, October.
    21. Frahm, Gabriel & Nordhausen, Klaus & Oja, Hannu, 2020. "M-estimation with incomplete and dependent multivariate data," Journal of Multivariate Analysis, Elsevier, vol. 176(C).
