Printed from https://ideas.repec.org/p/pra/mprapa/126685.html

Using Random Forest Machine Learning to Identify Homes at High Risk from Wildfires in California Counties

Author

Listed:
  • Schmidt, James

Abstract

Wildfires driven by extreme winds, such as the Camp Fire in 2018 and the Eaton and Palisades fires in 2025, account for a large share of structure losses due to wildfires in California. Because these types of events are relatively rare, their risks are difficult to estimate using conventional simulation techniques. This study explores the use of the Random Forest machine learning algorithm as an alternative method for estimating wildfire risk to structures. Environmental variables are estimated for 57,000 structures destroyed in wildfires in California and for 6.2 million unburned structures with the potential for wildfire exposure. A Random Forest model, trained on both the burned and unburned structures, identifies which variables are most effective in distinguishing between the two and which unburned structures belong in the High-Risk category.

The six environmental variables found to be the most important in identifying High-Risk structures are:

  • the annual Red Flag Warning hours (RFW)
  • the average Energy Release Component (ERC)
  • the Wildland Urban Interface Zone (WUI)
  • the Normalized Difference Vegetation Index (NDVI)
  • the annual number of downslope wind events (DW)
  • the proportion of sustained winds of 20 mph or greater on high fire danger days (SW20)

By adjusting the maximum tree-depth parameter, the Random Forest model is calibrated to produce a state-wide percentage of High-Risk structures of 12% in order to match estimates by the California Department of Insurance (CDI). The CDI estimates are based on a weighted average of insurance industry risk models. Although the Random Forest model matches the CDI estimates for the percentage of High-Risk structures at the state level, the percentage by county differs significantly from the CDI numbers.

The largest reductions in the percentage of High-Risk structures occur in the Central Sierra counties of Tuolumne and Mariposa (-48% and -34%, respectively). The largest increases occur in Mono County in the Eastern Sierras (+53%) and Ventura County in Southern California (+42%). Wind characteristics appear to be the primary reason for the differences in county risk ratings. Counties with fewer Red Flag Warning hours, fewer downslope wind days, and a smaller proportion of winds above 20 mph tend to have a smaller percentage of High-Risk structures than estimated by the CDI.
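
The depth-based calibration described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: it assumes scikit-learn's RandomForestClassifier, uses small synthetic data in place of the real structure records, and applies balanced class weights as an illustrative choice. The helper names (make_synthetic, high_risk_share, calibrate_depth) are hypothetical. The sketch trains one forest per candidate max_depth, measures the share of unburned structures classified as burned-like (High-Risk), and keeps the depth whose share is closest to the 12% target.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Variable abbreviations from the abstract; the values below are synthetic.
FEATURES = ["RFW", "ERC", "WUI", "NDVI", "DW", "SW20"]
TARGET_SHARE = 0.12  # state-wide High-Risk share cited in the abstract


def make_synthetic(n_burned=300, n_unburned=3000, seed=0):
    """Hypothetical stand-in data: burned structures get a shifted feature mean."""
    rng = np.random.default_rng(seed)
    x_burned = rng.normal(loc=1.0, scale=1.0, size=(n_burned, len(FEATURES)))
    x_unburned = rng.normal(loc=0.0, scale=1.0, size=(n_unburned, len(FEATURES)))
    x = np.vstack([x_burned, x_unburned])
    y = np.concatenate([np.ones(n_burned, dtype=int),
                        np.zeros(n_unburned, dtype=int)])
    return x, y


def high_risk_share(model, x_unburned):
    """Fraction of unburned structures the model classifies as burned-like."""
    return float(np.mean(model.predict(x_unburned) == 1))


def calibrate_depth(x, y, depths=range(2, 9), target=TARGET_SHARE):
    """Return (depth, share, model) whose High-Risk share is closest to target."""
    x_unburned = x[y == 0]
    best = None
    for depth in depths:
        model = RandomForestClassifier(
            n_estimators=50,
            max_depth=depth,
            class_weight="balanced",  # assumption: offsets the class imbalance
            random_state=0,
        )
        model.fit(x, y)
        share = high_risk_share(model, x_unburned)
        if best is None or abs(share - target) < abs(best[1] - target):
            best = (depth, share, model)
    return best


x, y = make_synthetic()
best_depth, best_share, best_model = calibrate_depth(x, y)
importances = dict(zip(FEATURES, best_model.feature_importances_))

print(f"selected max_depth: {best_depth}, High-Risk share: {best_share:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

In this sketch, shallower trees tend to flag more unburned structures as High-Risk, while deeper trees separate the training classes more sharply, which is why tree depth can act as a calibration knob. The printed importances are impurity-based scores on synthetic data and say nothing about the paper's actual variable ranking.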

Suggested Citation

  • Schmidt, James, 2025. "Using Random Forest Machine Learning to Identify Homes at High Risk from Wildfires in California Counties," MPRA Paper 126685, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:126685

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/126685/1/MPRA_paper_126685.pdf
    File Function: original version
    Download Restriction: no

    More about this item

    JEL classification:

    • D81 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Criteria for Decision-Making under Risk and Uncertainty
    • R23 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - Household Analysis - - - Regional Migration; Regional Labor Markets; Population
    • Y1 - Miscellaneous Categories - - Data: Tables and Charts
