Printed from https://ideas.repec.org/a/bla/biomet/v78y2022i1p324-336.html

Simultaneous spatial smoothing and outlier detection using penalized regression, with application to childhood obesity surveillance from electronic health records

Author

Listed:
  • Young‐Geun Choi
  • Lawrence P. Hanrahan
  • Derek Norton
  • Ying‐Qi Zhao

Abstract

Electronic health records (EHRs) have become a platform for data‐driven granular‐level surveillance in recent years. In this paper, we make use of EHRs for early prevention of childhood obesity. The proposed method simultaneously provides smooth disease mapping and outlier information for obesity prevalence that are useful for raising public awareness and facilitating targeted intervention. More precisely, we consider a penalized multilevel generalized linear model. We decompose regional contribution into smooth and sparse signals, which are automatically identified by a combination of fusion and sparse penalties imposed on the likelihood function. In addition, we weight the proposed likelihood to account for the missingness and potential nonrepresentativeness arising from the EHR data. We develop a novel alternating minimization algorithm, which is computationally efficient, easy to implement, and guarantees convergence. Simulation studies demonstrate superior performance of the proposed method. Finally, we apply our method to the University of Wisconsin Population Health Information Exchange database.
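
The smooth-plus-sparse decomposition and the alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm. It assumes a Gaussian response on a chain of regions, with no covariates, no likelihood weights, and no multilevel structure. It also swaps the fusion penalty for a quadratic graph-Laplacian smoothness penalty so that the smooth step has a closed form. All function names and tuning values are hypothetical.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal map of t * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def chain_laplacian(n):
    """Graph Laplacian of a path graph: region j neighbors region j + 1."""
    lap = np.zeros((n, n))
    for j in range(n - 1):
        lap[j, j] += 1.0
        lap[j + 1, j + 1] += 1.0
        lap[j, j + 1] -= 1.0
        lap[j + 1, j] -= 1.0
    return lap


def objective(y, u, v, lap, lam_smooth, lam_sparse):
    """0.5*||y-u-v||^2 + 0.5*lam_smooth*u'Lu + lam_sparse*||v||_1."""
    fit = 0.5 * np.sum((y - u - v) ** 2)
    smooth = 0.5 * lam_smooth * (u @ lap @ u)
    sparse = lam_sparse * np.sum(np.abs(v))
    return fit + smooth + sparse


def smooth_sparse_decompose(y, lap, lam_smooth, lam_sparse,
                            n_iter=200, tol=1e-10):
    """Alternate exact minimization over u (smooth) and v (sparse)."""
    n = len(y)
    u = np.zeros(n)
    v = np.zeros(n)
    system = np.eye(n) + lam_smooth * lap  # fixed across iterations
    history = [objective(y, u, v, lap, lam_smooth, lam_sparse)]
    for _ in range(n_iter):
        # u-step: ridge-type linear solve with v held fixed.
        u = np.linalg.solve(system, y - v)
        # v-step: soft-threshold the residual with u held fixed.
        v = soft_threshold(y - u, lam_sparse)
        history.append(objective(y, u, v, lap, lam_smooth, lam_sparse))
        if history[-2] - history[-1] < tol:
            break
    return u, v, history


# Toy data (assumed): a linear trend over 30 regions plus one outlying region.
n_regions = 30
y_toy = np.linspace(0.0, 1.0, n_regions)
y_toy[12] += 3.0
u_hat, v_hat, obj_history = smooth_sparse_decompose(
    y_toy, chain_laplacian(n_regions), lam_smooth=5.0, lam_sparse=0.5
)
print("regions flagged as outliers:", np.flatnonzero(v_hat))
```

Because each step minimizes the objective exactly over one block, the objective cannot increase from one iteration to the next. The paper's setting (a weighted GLM likelihood with an L1 fusion penalty) needs a different smooth step.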

Suggested Citation

  • Young‐Geun Choi & Lawrence P. Hanrahan & Derek Norton & Ying‐Qi Zhao, 2022. "Simultaneous spatial smoothing and outlier detection using penalized regression, with application to childhood obesity surveillance from electronic health records," Biometrics, The International Biometric Society, vol. 78(1), pages 324-336, March.
  • Handle: RePEc:bla:biomet:v:78:y:2022:i:1:p:324-336
    DOI: 10.1111/biom.13404

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13404
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ryan J. Parker & Brian J. Reich & Jo Eidsvik, 2016. "A Fused Lasso Approach to Nonstationary Spatial Covariance Estimation," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 21(3), pages 569-587, September.
    2. Hosik Choi & Eunjung Song & Seung-sik Hwang & Woojoo Lee, 2018. "A modified generalized lasso algorithm to detect local spatial clusters for count data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 102(4), pages 537-563, October.
    3. Howard D. Bondell & Brian J. Reich, 2009. "Simultaneous Factor Selection and Collapsing Levels in ANOVA," Biometrics, The International Biometric Society, vol. 65(1), pages 169-177, March.
    4. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    5. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
    6. Vinícius Diniz Mayrink & Renato Valladares Panaro & Marcelo Azevedo Costa, 2021. "Structural equation modeling with time dependence: an application comparing Brazilian energy distributors," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 105(2), pages 353-383, June.
    7. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    8. Doyo G Enki & Paul H Garthwaite & C Paddy Farrington & Angela Noufaily & Nick J Andrews & Andre Charlett, 2016. "Comparison of Statistical Algorithms for the Detection of Infectious Disease Outbreaks in Large Multiple Surveillance Systems," PLOS ONE, Public Library of Science, vol. 11(8), pages 1-25, August.
    9. Katherine Wilson & Jon Wakefield, 2022. "A probabilistic model for analyzing summary birth history data," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 47(11), pages 291-344.
    10. Thomas C. McHale & Claudia M. Romero-Vivas & Claudio Fronterre & Pedro Arango-Padilla & Naomi R. Waterlow & Chad D. Nix & Andrew K. Falconar & Jorge Cano, 2019. "Spatiotemporal Heterogeneity in the Distribution of Chikungunya and Zika Virus Case Incidences during their 2014 to 2016 Epidemics in Barranquilla, Colombia," IJERPH, MDPI, vol. 16(10), pages 1-21, May.
    11. Peter Congdon, 2010. "A multiple indicator, multiple cause method for representing social capital with an application to psychological distress," Journal of Geographical Systems, Springer, vol. 12(1), pages 1-23, March.
    12. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    13. Renato Assunção & Carl Schmertmann & Joseph Potter & Suzana Cavenaghi, 2005. "Empirical bayes estimation of demographic schedules for small areas," Demography, Springer;Population Association of America (PAA), vol. 42(3), pages 537-558, August.
    14. Peter Congdon, 2014. "Estimating life expectancies for US small areas: a regression framework," Journal of Geographical Systems, Springer, vol. 16(1), pages 1-18, January.
    15. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    16. Jie Jian & Peijun Sang & Mu Zhu, 2024. "Two Gaussian Regularization Methods for Time-Varying Networks," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 29(4), pages 853-873, December.
    17. Eibich, Peter & Ziebarth, Nicolas, 2014. "Examining the Structure of Spatial Health Effects in Germany Using Hierarchical Bayes Models," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 49, pages 305-320.
    18. Chen, Yewen & Chang, Xiaohui & Luo, Fangzhi & Huang, Hui, 2023. "Additive dynamic models for correcting numerical model outputs," Computational Statistics & Data Analysis, Elsevier, vol. 187(C).
    19. Li, Houjian & Tang, Mengqian & Cao, Andi & Guo, Lili, 2024. "How to reduce firm pollution discharges: Does political leaders' gender matter?," Technological Forecasting and Social Change, Elsevier, vol. 204(C).
    20. Dani Gamerman & Ajax R. B. Moreira, 2015. "Multivariate Spatial Regression Models," Discussion Papers 0116, Instituto de Pesquisa Econômica Aplicada - IPEA.
