IDEAS home Printed from https://ideas.repec.org/a/wly/riskan/v41y2021i1p37-55.html

Improved Transferability of Data‐Driven Damage Models Through Sample Selection Bias Correction

Authors
  • Dennis Wagenaar
  • Tiaravanni Hermawan
  • Marc J. C. van den Homberg
  • Jeroen C. J. H. Aerts
  • Heidi Kreibich
  • Hans de Moel
  • Laurens M. Bouwer

Abstract

Damage models for natural hazards are used for decision making on reducing and transferring risk. The damage estimates from these models depend on many variables and on their complex, sometimes nonlinear, relationships with the damage. In recent years, data‐driven modeling techniques have been used to capture those relationships. The data available to build such models are often limited, so in practice it is usually necessary to transfer models to a different context. In this article, we show that, as a result, the samples used to build a model are often not fully representative of the situation in which the model is applied, which leads to a “sample selection bias.” We enhance data‐driven damage models by applying methods not previously used in damage modeling to correct for this bias before the machine learning (ML) models are trained. We demonstrate this with case studies on flooding in Europe and typhoon wind damage in the Philippines. Two sample selection bias correction methods from the ML literature are applied, and one of these methods is also adjusted to our problem. These three methods are combined with stochastic generation of synthetic damage data. We demonstrate that, for both case studies, the sample selection bias correction techniques reduce model errors; for the mean bias error in particular, the reduction can exceed 30%. The novel combination with stochastic data generation seems to enhance these techniques. This shows that sample selection bias correction methods are beneficial for damage model transfer.
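
The abstract does not name the correction methods used, so the following is only a minimal sketch of one generic idea from the ML literature: reweighting source-region samples by an estimated density ratio so that they better resemble the target region before a model is fitted. Everything in it is an assumption made for illustration: the single covariate (water depth), the linear synthetic damage relation, the histogram-based weight estimate, and the helper names `importance_weights` and `mean_bias_error`. It is not the authors' implementation, and the "model" is deliberately trivial (a constant prediction equal to the mean damage).

```python
import random
from collections import Counter


def importance_weights(source_x, target_x, n_bins=10):
    """Histogram estimate of w(x) = p_target(x) / p_source(x) for each source sample."""
    lo = min(min(source_x), min(target_x))
    hi = max(max(source_x), max(target_x))
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range

    def bin_of(x):
        # Clamp so that x == hi falls into the last bin.
        return min(int((x - lo) / width), n_bins - 1)

    src_counts = Counter(bin_of(x) for x in source_x)
    tgt_counts = Counter(bin_of(x) for x in target_x)
    weights = []
    for x in source_x:
        b = bin_of(x)
        p_src = src_counts[b] / len(source_x)  # never zero: x itself is in bin b
        p_tgt = tgt_counts[b] / len(target_x)
        weights.append(p_tgt / p_src)
    return weights


def mean_bias_error(predicted, observed):
    """Mean of (prediction - observation); negative values mean underestimation."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


random.seed(0)

# Hypothetical synthetic data: the source region mostly saw shallow floods,
# the target region mostly deep ones; both cover depths between 0 and 3 m.
source_depth = [3 * random.random() ** 2 for _ in range(2000)]
target_depth = [3 * random.random() ** 0.5 for _ in range(2000)]


def damage(depth):
    # Assumed damage fraction: linear in depth plus small noise.
    return 0.2 * depth + random.gauss(0, 0.02)


source_damage = [damage(d) for d in source_depth]
target_damage = [damage(d) for d in target_depth]

weights = importance_weights(source_depth, target_depth)

# Trivial "model": predict one constant damage value for every target sample.
unweighted_pred = sum(source_damage) / len(source_damage)
weighted_pred = sum(w * d for w, d in zip(weights, source_damage)) / sum(weights)

mbe_unweighted = mean_bias_error([unweighted_pred] * len(target_damage), target_damage)
mbe_weighted = mean_bias_error([weighted_pred] * len(target_damage), target_damage)

print(f"MBE without correction: {mbe_unweighted:.3f}")
print(f"MBE with importance weights: {mbe_weighted:.3f}")
```

In a more realistic workflow the weights would typically be passed to the learning algorithm as per-sample weights rather than used in a weighted mean, and the density ratio would be estimated over several covariates at once.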

Suggested Citation

  • Dennis Wagenaar & Tiaravanni Hermawan & Marc J. C. van den Homberg & Jeroen C. J. H. Aerts & Heidi Kreibich & Hans de Moel & Laurens M. Bouwer, 2021. "Improved Transferability of Data‐Driven Damage Models Through Sample Selection Bias Correction," Risk Analysis, John Wiley & Sons, vol. 41(1), pages 37-55, January.
  • Handle: RePEc:wly:riskan:v:41:y:2021:i:1:p:37-55
    DOI: 10.1111/risa.13575

    Download full text from publisher

    File URL: https://doi.org/10.1111/risa.13575
    Download Restriction: no

    File URL: https://libkey.io/10.1111/risa.13575?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Andrea E Gaughan & Forrest R Stevens & Catherine Linard & Peng Jia & Andrew J Tatem, 2013. "High Resolution Population Distribution Maps for Southeast Asia in 2010 and 2015," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-11, February.
    2. Helen J Mayfield & Carl S Smith & John H Lowry & Conall H Watson & Michael G Baker & Mike Kama & Eric J Nilles & Colleen L Lau, 2018. "Predictive risk mapping of an environmentally-driven infectious disease using spatial Bayesian networks: A case study of leptospirosis in Fiji," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 12(10), pages 1-16, October.
    3. Roshanak Nateghi & Seth D. Guikema & Steven M. Quiring, 2011. "Comparison and Validation of Statistical Methods for Predicting Power Outage Durations in the Event of Hurricanes," Risk Analysis, John Wiley & Sons, vol. 31(12), pages 1897-1906, December.
    4. Roshanak Nateghi & Seth Guikema & Steven M. Quiring, 2014. "Power Outage Estimation for Tropical Cyclones: Improved Accuracy with Simpler Models," Risk Analysis, John Wiley & Sons, vol. 34(6), pages 1069-1078, June.
    5. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    6. Tina Gerl & Heidi Kreibich & Guillermo Franco & David Marechal & Kai Schröter, 2016. "A Review of Flood Loss Models as Basis for Harmonization and Benchmarking," PLOS ONE, Public Library of Science, vol. 11(7), pages 1-22, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hughes, William & Zhang, Wei & Cerrai, Diego & Bagtzoglou, Amvrossios & Wanik, David & Anagnostou, Emmanouil, 2022. "A Hybrid Physics-Based and Data-Driven Model for Power Distribution System Infrastructure Hardening and Outage Simulation," Reliability Engineering and System Safety, Elsevier, vol. 225(C).
    2. Hossain, Eklas & Roy, Shidhartho & Mohammad, Naeem & Nawar, Nafiu & Dipta, Debopriya Roy, 2021. "Metrics and enhancement strategies for grid resilience and reliability during natural disasters," Applied Energy, Elsevier, vol. 290(C).
    3. Hughes, William & Watson, Peter L. & Cerrai, Diego & Zhang, Xinxuan & Bagtzoglou, Amvrossios & Zhang, Wei & Anagnostou, Emmanouil, 2024. "Assessing grid hardening strategies to improve power system performance during storms using a hybrid mechanistic-machine learning outage prediction model," Reliability Engineering and System Safety, Elsevier, vol. 248(C).
    4. Berk A. Alpay & David Wanik & Peter Watson & Diego Cerrai & Guannan Liang & Emmanouil Anagnostou, 2020. "Dynamic Modeling of Power Outages Caused by Thunderstorms," Forecasting, MDPI, vol. 2(2), pages 1-12, May.
    5. Hughes, William & Zhang, Wei & Bagtzoglou, Amvrossios C. & Wanik, David & Pensado, Osvaldo & Yuan, Hao & Zhang, Jintao, 2021. "Damage modeling framework for resilience hardening strategy for overhead power distribution systems," Reliability Engineering and System Safety, Elsevier, vol. 207(C).
    6. Feifei Yang & Diego Cerrai & Emmanouil N. Anagnostou, 2021. "The Effect of Lead-Time Weather Forecast Uncertainty on Outage Prediction Modeling," Forecasting, MDPI, vol. 3(3), pages 1-16, July.
    7. Olukunle O. Owolabi & Deborah A. Sunter, 2022. "Bayesian Optimization and Hierarchical Forecasting of Non-Weather-Related Electric Power Outages," Energies, MDPI, vol. 15(6), pages 1-22, March.
    8. Darima Fotheringham & Michael A. Wiles, 2023. "The effect of implementing chatbot customer service on stock returns: an event study analysis," Journal of the Academy of Marketing Science, Springer, vol. 51(4), pages 802-822, July.
    9. Song, Wei-Ling & Uzmanoglu, Cihan, 2016. "TARP announcement, bank health, and borrowers’ credit risk," Journal of Financial Stability, Elsevier, vol. 22(C), pages 22-32.
    10. Raymundo M. Campos-Vázquez, 2013. "Efectos de los ingresos no reportados en el nivel y tendencia de la pobreza laboral en México," Ensayos Revista de Economia, Universidad Autonoma de Nuevo Leon, Facultad de Economia, vol. 0(2), pages 23-54, November.
    11. Stephen Brown & William Goetzmann & Bing Liang & Christopher Schwarz, 2008. "Mandatory Disclosure and Operational Risk: Evidence from Hedge Fund Registration," Journal of Finance, American Finance Association, vol. 63(6), pages 2785-2815, December.
    12. Paul W. Miller & Barry R. Chiswick, 2002. "Immigrant earnings: Language skills, linguistic concentrations and the business cycle," Journal of Population Economics, Springer;European Society for Population Economics, vol. 15(1), pages 31-57.
    13. Chul‐Woo Kwon & Peter F. Orazem & Daniel M. Otto, 2006. "Off‐farm labor supply responses to permanent and transitory farm income," Agricultural Economics, International Association of Agricultural Economists, vol. 34(1), pages 59-67, January.
    14. Jonathan Gruber & Aaron Yelowitz, 1999. "Public Health Insurance and Private Savings," Journal of Political Economy, University of Chicago Press, vol. 107(6), pages 1249-1274, December.
    15. Jean-Louis Arcand & Linguère M'Baye, 2013. "Braving the waves: the role of time and risk preferences in illegal migration from Senegal," CERDI Working papers halshs-00855937, HAL.
    16. Sandra Müllbacher & Wolfgang Nagl, 2017. "Labour supply in Austria: an assessment of recent developments and the effects of a tax reform," Empirica, Springer;Austrian Institute for Economic Research;Austrian Economic Association, vol. 44(3), pages 465-486, August.
    17. Campbell, Randall C. & Nagel, Gregory L., 2016. "Private information and limitations of Heckman's estimator in banking and corporate finance research," Journal of Empirical Finance, Elsevier, vol. 37(C), pages 186-195.
    18. Leye Li & Louise Yi Lu & Dongyue Wang, 2022. "External labour market competitions and stock price crash risk: evidence from exposures to competitor CEOs’ award‐winning events," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 62(S1), pages 1421-1460, April.
    19. Jože P. Damijan & Mark Knell, 2005. "How Important Is Trade and Foreign Ownership in Closing the Technology Gap? Evidence from Estonia and Slovenia," Review of World Economics (Weltwirtschaftliches Archiv), Springer;Institut für Weltwirtschaft (Kiel Institute for the World Economy), vol. 141(2), pages 271-295, July.
    20. Calcagno, R. & Renneboog, L.D.R., 2004. "Capital Structure and Managerial Compensation : The Effects of Renumeration Seniority," Discussion Paper 2004-120, Tilburg University, Center for Economic Research.
