
A method of correction for heaping error in the variables using validation data

Author

Listed:
  • Amar S. Ahmad

    (New York University)

  • Munther Al-Hassan

    (Dubai Men’s College)

  • Hamid Y. Hussain

    (Dubai Health Authority)

  • Nirmin F. Juber

    (New York University)

  • Fred N. Kiwanuka

    (Dubai Men’s College)

  • Mohammed Hag-Ali

    (Higher Colleges of Technology)

  • Raghib Ali

    (New York University)

Abstract

When self-reported data are used to estimate a mean, a variance, or regression parameters, the estimates are often biased, because interviewees tend to heap their answers at certain values. This paper examines the bias that heaping error induces in self-reported data and studies its effect on the mean and variance of a distribution as well as on regression parameters. A new method is then introduced that uses validation data to correct the bias due to heaping error. Using publicly available data and simulation studies, the paper shows that the new method is practical and easy to apply for correcting bias in the estimated mean and variance, as well as in the estimated regression parameters, computed from self-reported data. The correction thus allows researchers to draw more accurate conclusions and make better-informed decisions, e.g. regarding health care planning and delivery.
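The following sketch illustrates the kind of bias the abstract describes. It is not the authors' method, which is not specified on this page. It assumes a hypothetical heaping mechanism in which a fraction of respondents round their true value down to a multiple of 10, and it applies a simple moment adjustment (a mean shift and a variance ratio) estimated from a validation subsample in which both the true and the reported values are observed. The function name `simulate` and all parameter values are illustrative assumptions.

```python
import math
import random
import statistics


def simulate(n=20000, n_val=2000, p_heap=0.6, step=10, seed=1):
    """Simulate heaped self-reports and a simple validation-based adjustment.

    Assumption (not from the paper): with probability p_heap a respondent
    reports the true value rounded DOWN to a multiple of `step`.
    """
    rng = random.Random(seed)

    # True values, e.g. body weight in kg (illustrative distribution).
    true = [rng.gauss(70.0, 10.0) for _ in range(n)]

    # Reported values: heaped with probability p_heap, exact otherwise.
    reported = [
        step * math.floor(x / step) if rng.random() < p_heap else x
        for x in true
    ]

    # Validation subsample: both true and reported values are observed.
    true_val, rep_val = true[:n_val], reported[:n_val]
    # Main sample: only reported values are observed in practice.
    true_main, rep_main = true[n_val:], reported[n_val:]

    # Adjustment factors estimated from the validation data only.
    shift = statistics.fmean(true_val) - statistics.fmean(rep_val)
    ratio = statistics.variance(true_val) / statistics.variance(rep_val)

    naive_mean = statistics.fmean(rep_main)
    naive_var = statistics.variance(rep_main)

    return {
        "true_mean": statistics.fmean(true_main),   # known only in simulation
        "naive_mean": naive_mean,
        "corrected_mean": naive_mean + shift,
        "true_var": statistics.variance(true_main),  # known only in simulation
        "naive_var": naive_var,
        "corrected_var": naive_var * ratio,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>15}: {value:.3f}")
```

Under these assumptions the naive mean is pulled downward and the naive variance is inflated, while the adjusted estimates move back toward the values computed from the true data. A symmetric heaping rule (rounding to the nearest multiple) would leave the mean nearly unbiased but would still inflate the variance.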

Suggested Citation

  • Amar S. Ahmad & Munther Al-Hassan & Hamid Y. Hussain & Nirmin F. Juber & Fred N. Kiwanuka & Mohammed Hag-Ali & Raghib Ali, 2024. "A method of correction for heaping error in the variables using validation data," Statistical Papers, Springer, vol. 65(2), pages 687-704, April.
  • Handle: RePEc:spr:stpapr:v:65:y:2024:i:2:d:10.1007_s00362-023-01405-4
    DOI: 10.1007/s00362-023-01405-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-023-01405-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



