
Exploring a Diagnostic Test for Missingness at Random

Authors

  • Dominick Sutton (School of Geographical & Earth Sciences, University of Glasgow, Glasgow G12 8QQ, UK)
  • Anahid Basiri (School of Geographical & Earth Sciences, University of Glasgow, Glasgow G12 8QQ, UK)
  • Ziqi Li (Department of Geography, Florida State University, Tallahassee, FL 32306, USA)

Abstract

Missing data remain a challenge for researchers and decision-makers due to their impact on analytical accuracy and uncertainty estimation. Many studies on missing data are based on randomness, but randomness itself is problematic. This makes it difficult to identify missing data mechanisms and limits how effectively the impact of missing data can be minimized. The purpose of this paper is to examine a potentially simple test to diagnose whether data are missing at random. The test is developed using an extended taxonomy of missing data mechanisms. A key aspect of the approach is the use of single mean imputation to handle missing data in the test development dataset. Changing this to random imputation from the same underlying distribution, however, has a negative impact on the diagnosis. This is aggravated by the possibility of high inter-variable correlation, confounding, and mixed missing data mechanisms. The verification step uses a high-quality real-world dataset and finds some evidence in one case that the data may be missing at random, but the evidence is less persuasive in the second case. Confidence in these results, however, is limited by the potential influence of correlation, confounding, and mixed missingness. The paper concludes with a discussion of the test's merits and finds that sufficient uncertainties remain to render it unreliable, even if the initial results appear promising.
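
The two imputation strategies contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' diagnostic test or their data. The function names, the synthetic normal sample, the 30% deletion rate, and the choice to draw random fills from a normal distribution fitted to the observed values are all assumptions made here for illustration.

```python
import random
import statistics


def mean_impute(values):
    """Replace every None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


def random_impute(values, rng):
    """Replace every None with a draw from a normal fitted to the observed values.

    Assumption: a fitted normal stands in for "the same underlying distribution".
    """
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    sigma = statistics.stdev(observed)
    return [rng.gauss(mu, sigma) if v is None else v for v in values]


rng = random.Random(0)

# Hypothetical complete sample (synthetic, not from the paper).
complete = [rng.gauss(50.0, 10.0) for _ in range(2000)]

# Delete each value independently with probability 0.3, so the
# missingness here does not depend on any observed or unobserved value.
with_gaps = [None if rng.random() < 0.3 else v for v in complete]

observed = [v for v in with_gaps if v is not None]
mean_filled = mean_impute(with_gaps)
random_filled = random_impute(with_gaps, rng)

# Compare the spread of the observed values with each filled-in series.
print("observed stdev:     ", statistics.stdev(observed))
print("mean-imputed stdev: ", statistics.stdev(mean_filled))
print("random-imputed stdev:", statistics.stdev(random_filled))
```

Mean imputation leaves the sample mean unchanged but shrinks the spread, because every filled value sits exactly at the mean. Random draws keep the spread closer to that of the observed values at the cost of adding noise. The sketch shows only this difference between the two fills, not how either one affects the diagnosis described in the paper.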

Suggested Citation

  • Dominick Sutton & Anahid Basiri & Ziqi Li, 2025. "Exploring a Diagnostic Test for Missingness at Random," Mathematics, MDPI, vol. 13(11), pages 1-28, May.
  • Handle: RePEc:gam:jmathe:v:13:y:2025:i:11:p:1728-:d:1663304

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/13/11/1728/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/13/11/1728/
    Download Restriction: no
