Brucellosis remains endemic in the Bayingolin Mongol Autonomous Prefecture (Bayingolin Prefecture), posing a dual threat to public health and the livestock economy. To support the development of effective control measures, we analyzed the epidemiological profile of brucellosis and compared the performance of the Seasonal Autoregressive Integrated Moving Average (SARIMA) model with that of the Extreme Gradient Boosting (XGBoost) model for short-term case prediction. This study provides a scientific basis for early epidemic warning and the formulation of prevention strategies in this region. Monthly reported data on human brucellosis cases in the Bayingolin Prefecture from January 2011 to December 2024 were collected. After cleaning and interpolation, the data were partitioned into training and testing sets. We used Joinpoint regression to analyze the temporal incidence trend and calculate the annual percentage change (APC). We then developed SARIMA and XGBoost models to forecast monthly case numbers. The predictive performance of both models was evaluated and compared using the mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and the coefficient of determination (R²). Between 2011 and 2023, a total of 5,955 human brucellosis cases were reported in the Bayingolin Prefecture, corresponding to an average annual incidence rate of 35.08 per 100,000 population. Incidence followed three distinct phases: a rapid increase from 2011 to 2014 (APC = 43.59%), a gradual decline from 2014 to 2020 (APC = -11.25%), and a rebound from 2020 to 2023 (APC = 40.55%). Marked seasonality was also observed, with the highest number of cases recorded from May to August (2,925 cases, 49.12%).
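For readers unfamiliar with the APC, the following is a minimal sketch of how it is conventionally derived within a single trend segment: a straight line is fitted to the natural logarithm of the annual rate, and the slope b is converted to a percentage as (e^b − 1) × 100. The function name and the input rates are hypothetical and are not taken from the study data; Joinpoint regression additionally estimates where the segments begin and end, which this sketch does not attempt.

```python
import math


def annual_percent_change(years, rates):
    """Return the APC (%) for one segment via a log-linear least-squares fit.

    Hypothetical helper: fits ln(rate) = a + b * year and converts the
    slope b to a percentage change per year.
    """
    if len(years) != len(rates) or len(years) < 2:
        raise ValueError("need at least two (year, rate) pairs of equal length")
    logs = [math.log(r) for r in rates]  # rates must be strictly positive
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return (math.exp(slope) - 1) * 100


# Illustrative rates that grow by exactly 10% per year (not study data).
years = [2011, 2012, 2013, 2014]
rates = [10.0 * 1.10 ** i for i in range(4)]
apc = annual_percent_change(years, rates)
```

A positive APC indicates that the rate rose by that percentage per year on average within the segment, and a negative APC indicates a decline.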
Geographically, cases were primarily concentrated in Yanqi Hui Autonomous County (1,588 cases, 26.67%) and Hejing County (1,208 cases, 20.29%). Demographically, male farmers and herders aged 40–60 years constituted the most affected group. The XGBoost model outperformed the SARIMA model, achieving lower errors across all metrics (MAE = 7.68, MAPE = 16.39%, RMSE = 10.65) and a higher coefficient of determination (R² = 0.66) than the SARIMA model (MAE = 9.26, MAPE = 21.63%, RMSE = 10.90, R² = 0.61). This result suggests that the XGBoost model is better able to capture complex nonlinear patterns. In this study, XGBoost was more accurate for short-term forecasting, whereas SARIMA has a simpler structure and is easier to implement at the grassroots level. We therefore recommend a tiered early warning system: XGBoost should be deployed in units with ample computational resources to support precise prevention and control decisions, and SARIMA should be used in grassroots units for routine monitoring, while intervention measures are intensified in key areas and among high-risk populations. This strategy would optimize resource allocation and improve the efficiency of prevention and control efforts.
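The four evaluation metrics used for the model comparison can be computed as in the sketch below, which uses their standard definitions. The function name and the example values are hypothetical and are not the study's test data. R² is computed here as 1 − SS_res/SS_tot; this is an assumption, since the abstract does not state which definition the authors used.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, MAPE (%), RMSE and R² for paired observed/forecast values.

    Hypothetical helper. MAPE requires every actual value to be nonzero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


# Illustrative monthly counts and forecasts (not study data).
actual = [40, 50, 60, 50]
predicted = [44, 45, 66, 50]
metrics = forecast_metrics(actual, predicted)
```

Lower MAE, MAPE and RMSE and a higher R² indicate a closer fit to the held-out observations, which is the basis on which the two models are ranked above.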