Quantification of Continuous Flood Hazard using Random Forest Classification and Flood Insurance Claims at Large Spatial Scales: A Pilot Study in Southeast Texas

Prepared by:

William Mobley, Antonia Sebastian, Russell Blessing, Wesley E. Highfield, Laura Stearns, and Samuel D. Brody

Prepared for:

Natural Hazards and Earth System Sciences

Publication Date:

October 28, 2020

Pre-disaster planning and mitigation require detailed spatial information about flood hazards and their associated risks. In the U.S., the FEMA Special Flood Hazard Area (SFHA) provides important information about areas subject to flooding during the 1% annual chance riverine or coastal event. However, the binary nature of flood hazard maps obscures both the distribution of property risk inside the SFHA and the residual risk outside it, which can undermine mitigation efforts. Machine-learning techniques provide an alternative approach to estimating flood hazards across large spatial scales at low computational expense. This pilot study for the Texas Gulf Coast region uses Random Forest Classification to predict flood probability across a 30,523 km² area. Using a record of National Flood Insurance Program (NFIP) claims dating back to 1976 and high-resolution geospatial data, we generate a continuous flood hazard map for twelve USGS HUC-8 watersheds. Results indicate that the Random Forest model predicts flooding with high discriminative skill (AUC 0.895), especially compared to the existing FEMA regulatory floodplain. Our model identifies 649,000 structures with at least a 1% annual chance of flooding, roughly three times more than FEMA currently identifies as flood prone.
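For readers unfamiliar with the AUC statistic reported above, the following minimal sketch shows one way it can be computed: as the fraction of (flooded, non-flooded) location pairs in which the flooded location receives the higher predicted probability. The function name `roc_auc`, the labels, and the scores are hypothetical and purely illustrative; they are not the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # flooded location ranked above non-flooded
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = location with a flood claim, 0 = no claim.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted flood probabilities from a classifier.
scores = [0.92, 0.71, 0.40, 0.35, 0.10, 0.55, 0.05, 0.60]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value near 0.9 means that flooded locations are ranked above non-flooded ones in roughly nine of ten such pairs.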