This study presents a national-scale machine learning framework for analysing and forecasting compound flood risk across the United Kingdom. The research integrates daily river flow, catchment rainfall, and sea level observations from 1,311 National River Flow Archive (NRFA) gauging stations covering the period 2010–2024, producing a unified dataset of approximately 6.7 million station–day observations. Compound flooding, defined as the interaction between elevated river discharge and high coastal sea levels, is a major but often underrepresented flood hazard in UK risk assessment frameworks. A comprehensive feature engineering pipeline was developed to capture hydrological dynamics, including lagged flow conditions, rolling rainfall statistics, antecedent wetness indicators, seasonal encoding, and a compound flood indicator that flags joint river–coastal exceedance. Multiple machine learning classifiers were evaluated, including Logistic Regression, Random Forest, XGBoost, and HistGradientBoosting. The HistGradientBoosting classifier achieved the best performance with a macro-average F1 score of 0.9417, test accuracy of 96.63%, and macro AUC of 0.9969, demonstrating strong predictive capability across three flood risk classes (Low, Medium, High). Model interpretability was assessed using SHAP (SHapley Additive exPlanations), revealing that current river flow magnitude and long-term mean flow are the dominant predictors of flood risk, while the compound flooding indicator provides additional explanatory power for estuarine and coastal catchments. A rolling-origin backtest across evaluation years confirmed robust temporal generalisation of the model. Using the trained model, a probabilistic 10-year forecast (2026–2035) of UK flood risk was generated through synthetic flow simulations and bootstrap uncertainty estimation.
Results suggest a gradual increase in medium-risk flood conditions across UK catchments, while the proportion of extreme high-risk events remains relatively stable. These findings provide an interpretable and computationally efficient framework for national-scale flood risk monitoring, with potential applications in emergency management, infrastructure resilience planning, insurance risk assessment, and climate adaptation policy. All data sources used in this study are publicly available, including river flow and rainfall observations from the National River Flow Archive (NRFA) and sea level measurements from the Permanent Service for Mean Sea Level (PSMSL). The work demonstrates the potential of modern machine learning methods to complement traditional hydrological modelling approaches in large-scale environmental risk analysis.
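The feature families named above (lagged flow, rolling rainfall statistics, antecedent wetness, seasonal encoding, and a joint river–coastal exceedance flag) can be illustrated with a minimal pandas sketch. This is not the study's pipeline: the column names (`station`, `date`, `flow`, `rainfall`, `sea_level`), the window lengths (1, 7, and 30 days), and the per-station 95th-percentile exceedance threshold are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame, q: float = 0.95) -> pd.DataFrame:
    """Add illustrative hydrological features to a station-day table.

    Assumed (hypothetical) columns: station, date, flow, rainfall, sea_level.
    """
    df = df.sort_values(["station", "date"]).reset_index(drop=True)
    by_station = df.groupby("station", sort=False)

    # Lagged flow: discharge one day and seven days earlier at the same station.
    df["flow_lag1"] = by_station["flow"].shift(1)
    df["flow_lag7"] = by_station["flow"].shift(7)

    # Rolling rainfall statistics over a 7-day window ending today.
    df["rain_sum7"] = by_station["rainfall"].transform(
        lambda s: s.rolling(7, min_periods=1).sum()
    )
    df["rain_max7"] = by_station["rainfall"].transform(
        lambda s: s.rolling(7, min_periods=1).max()
    )

    # Antecedent wetness (assumed proxy): rainfall total over the 30 days
    # before today, so the current day's rainfall is excluded.
    df["wetness30"] = by_station["rainfall"].transform(
        lambda s: s.shift(1).rolling(30, min_periods=1).sum()
    )

    # Seasonal encoding: day of year mapped onto a circle.
    doy = df["date"].dt.dayofyear
    df["doy_sin"] = np.sin(2 * np.pi * doy / 365.25)
    df["doy_cos"] = np.cos(2 * np.pi * doy / 365.25)

    # Compound indicator: flow and sea level both above their per-station
    # q-quantile on the same day (threshold choice is an assumption).
    flow_thr = by_station["flow"].transform(lambda s: s.quantile(q))
    sea_thr = by_station["sea_level"].transform(lambda s: s.quantile(q))
    df["compound"] = (
        (df["flow"] > flow_thr) & (df["sea_level"] > sea_thr)
    ).astype(int)

    return df
```

In this sketch the exceedance thresholds are computed over each station's full record, which uses future data. For a temporally honest evaluation such as a rolling-origin backtest, they would need to be estimated from the training period only.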