Accurately quantifying the spatial distribution of topsoil pH and identifying its influencing factors is essential for recognizing potential land-use challenges and for restoring and balancing soil ecological functions. In this study, 1 795 soil samples were collected from the hilly region of southern Sichuan, China, and topsoil pH was modeled with four base machine learning models, namely random forest (RF), support vector regression (SVR), extreme gradient boosting (XGB), and artificial neural network (ANN), and with two ensemble learning approaches, Boosting and Stacking. Model performance was assessed and compared, and Shapley additive explanations (SHAP) were applied to interpret the contributions and interactions of environmental predictors. The ensemble models achieved higher predictive accuracy than the individual base learners, with the Boosting model performing best (<i>R</i><sup>2</sup>=0.862). All six models produced consistent spatial prediction trends, although the predicted pH values spanned a slightly narrower range than the measured values. Soil pH across the study area displayed a spatially stratified pattern, generally decreasing from north to south. The four most influential factors were total potassium (TK), bulk density (BD), soil organic carbon (SOC), and annual precipitation. Partial dependence analysis indicated that soil pH increased significantly when TK ranged from 16.25 to 17.34 g·kg<sup>-1</sup> but decreased once TK exceeded 17.83 g·kg<sup>-1</sup>. SOC had a negative effect on soil pH, particularly when SOC content exceeded 8.25 g·kg<sup>-1</sup>. Moreover, interaction analysis showed that the synergistic effects among factors were heterogeneous. These findings highlight the potential of interpretable ensemble learning methods for modeling soil properties and provide theoretical support for developing targeted strategies to regulate soil pH.
They also offer a scientific basis for improving soil health resilience and advancing sustainable soil ecological management in complex agricultural landscapes.
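The two evaluation quantities mentioned above, the coefficient of determination (<i>R</i><sup>2</sup>) and the compression of the predicted value range relative to the measured range, can be illustrated with a minimal sketch. The function names `r_squared` and `range_ratio` and the pH values are hypothetical and chosen only for illustration; they are not the study's code or data, and the study may have computed these quantities differently.

```python
from statistics import mean


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - m) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def range_ratio(measured, predicted):
    """Width of the predicted range divided by the measured range.

    A value below 1 means the predictions are compressed toward the centre.
    """
    return (max(predicted) - min(predicted)) / (max(measured) - min(measured))


# Hypothetical pH values for illustration only; not data from the study.
measured = [4.5, 5.0, 5.8, 6.5, 7.2, 8.0]
predicted = [4.8, 5.1, 5.7, 6.4, 7.0, 7.6]

print(f"R^2 = {r_squared(measured, predicted):.3f}")
print(f"range ratio = {range_ratio(measured, predicted):.2f}")
```

A range ratio below 1 together with a high <i>R</i><sup>2</sup> corresponds to the pattern described above: predictions follow the measured trend closely while underestimating the extremes.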