
Exploring quality dimensions in trustworthy Machine Learning in the context of official statistics: model explainability and uncertainty quantification

Author

Listed:
  • Saeid Molladavoudi

    (Statistics Canada)

  • Wesley Yung

    (Statistics Canada)

Abstract

Although National Statistical Offices (NSOs) continue to adopt Machine Learning (ML) methods and tools in many areas of their operations, including data collection, integration, and processing, it is still unclear how these complex, prediction-oriented approaches can be incorporated into the quality standards and frameworks within NSOs, or whether the frameworks themselves need to be modified. This article builds upon two of the quality dimensions proposed in the Quality Framework for Statistical Algorithms (QF4SA): model explainability and accuracy (including uncertainty). It examines in further detail the implications of current methods for explainable ML and uncertainty quantification, as well as their possible uses in statistical production, such as continuous model monitoring in intermediate ML classification and auto-coding phases. This strategy will ensure that human subject-matter experts, who are an essential component of every statistical program, are effectively integrated into the life cycle of ML projects. It will also ensure that the quality of ML models in production is maintained and that the current quality frameworks within NSOs are adhered to, and will ultimately boost confidence and trust in these emerging technologies.

Suggested Citation

  • Saeid Molladavoudi & Wesley Yung, 2023. "Exploring quality dimensions in trustworthy Machine Learning in the context of official statistics: model explainability and uncertainty quantification," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 17(3), pages 223-252, December.
  • Handle: RePEc:spr:astaws:v:17:y:2023:i:3:d:10.1007_s11943-023-00331-z
    DOI: 10.1007/s11943-023-00331-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11943-023-00331-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11943-023-00331-z?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item



    Citations



    Cited by:

    1. Florian Dumpert & Sebastian Wichert & Thomas Augustin & Nina Storfinger, 2023. "Editorial issue 3 + 4, 2023," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 17(3), pages 191-194, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Carl-Erik Särndal & Imbi Traat & Kaur Lumiste, 2018. "Interaction Between Data Collection And Estimation Phases In Surveys With Nonresponse," Statistics in Transition New Series, Polish Statistical Association, vol. 19(2), pages 183-200, June.
    2. Barranco-Chamorro, I. & Jiménez-Gamero, M.D. & Moreno-Rebollo, J.L. & Muñoz-Pichardo, J.M., 2012. "Case-deletion type diagnostics for calibration estimators in survey sampling," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2219-2236.
    3. Särndal Carl-Erik & Traat Imbi & Lumiste Kaur, 2018. "Interaction Between Data Collection And Estimation Phases In Surveys With Nonresponse," Statistics in Transition New Series, Statistics Poland, vol. 19(2), pages 183-200, June.
    4. Sumanta Adhya & Tathagata Banerjee & Gaurangadeb Chattopadhyay, 2012. "Inference on finite population categorical response: nonparametric regression-based predictive approach," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 96(1), pages 69-98, January.
    5. Hengfang Wang & Jae Kwang Kim, 2025. "Information projection approach to smoothed propensity score weighting for handling selection bias under missing at random," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 77(1), pages 127-153, February.
    6. Giorgio E. Montanari & M. Giovanna Ranalli, 2006. "A Mixed Model-assisted Regression Estimator that Uses Variables Employed at the Design Stage," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 15(2), pages 139-149, August.
    7. Liu, Chun & Chen, Yang & Li, Shanmin & Sun, Liang & Yang, Mengjie, 2021. "Local political corruption and M&As," China Economic Review, Elsevier, vol. 69(C).
    8. Domingo Morales & María del Mar Rueda & Dolores Esteban, 2018. "Model-Assisted Estimation of Small Area Poverty Measures: An Application within the Valencia Region in Spain," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 138(3), pages 873-900, August.
    9. M. Rueda & I. Sánchez-Borrego & A. Arcos & S. Martínez, 2010. "Model-calibration estimation of the distribution function using nonparametric regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 71(1), pages 33-44, January.
    10. Sanjoy Sinha & Abdus Sattar, 2015. "Inference in semi-parametric spline mixed models for longitudinal data," METRON, Springer;Sapienza Università di Roma, vol. 73(3), pages 377-395, December.
    11. Jae Kwang Kim & Mingue Park, 2010. "Calibration Estimation in Survey Sampling," International Statistical Review, International Statistical Institute, vol. 78(1), pages 21-39, April.
    12. Donald P. Green & Winston Lin & Claudia Gerber, 2018. "Optimal Allocation of Interviews to Baseline and Endline Surveys in Place-Based Randomized Trials and Quasi-Experiments," Evaluation Review, vol. 42(4), pages 391-422, August.
    13. Jan Pablo Burgard & Ralf Münnich & Martin Rupp, 2019. "A Generalized Calibration Approach Ensuring Coherent Estimates with Small Area Constraints," Research Papers in Economics 2019-10, University of Trier, Department of Economics.
    14. Changbao Wu & Shixiao Zhang, 2019. "Comments on: Deville and Särndal’s calibration: revisiting a 25 years old successful optimization problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1082-1086, December.
    15. I. Sánchez-Borrego & A. Arcos & M. Rueda, 2019. "Kernel-based methods for combining information of several frame surveys," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 82(1), pages 71-86, January.
    16. Zhan Liu & Chaofeng Tu & Yingli Pan, 2022. "Model-assisted calibration with SCAD to estimated control for non-probability samples," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(4), pages 849-879, October.
    17. Luis Castro-Martín & María del Mar Rueda & Ramón Ferri-García & César Hernando-Tamayo, 2021. "On the Use of Gradient Boosting Methods to Improve the Estimation with Data Obtained with Self-Selection Procedures," Mathematics, MDPI, vol. 9(23), pages 1-23, November.
    18. Liu Bin & Yu Cindy Long & Price Michael Joseph & Jiang Yan, 2018. "Generalized Method of Moments Estimators for Multiple Treatment Effects Using Observational Data from Complex Surveys," Journal of Official Statistics, Sciendo, vol. 34(3), pages 753-784, September.
    19. Giancarlo Diana & Pier Francesco Perri, 2012. "A calibration-based approach to sensitive data: a simulation study," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(1), pages 53-65, March.
    20. Adhya Sumanta & Banerjee, Tathagata & Chattopadhyay, G., 2007. "Inference on Categorical Survey Response: A Predictive Approach," IIMA Working Papers WP2007-05-07, Indian Institute of Management Ahmedabad, Research and Publication Department.
