This study investigates the application of machine learning (ML) to predict Standard Penetration Test (SPT-N) values from Cone Penetration Test (CPT) data, with the aim of making geotechnical site characterization more efficient. The dataset comes from a 16-mi section of the San Diego Freeway (I-405) expansion project and comprises approximately 45,000 records with CPT features (depth, cone tip resistance, sleeve resistance, pore water pressure, and soil classification). Seven ML models were developed and compared: Linear Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine (SVM), Naive Bayes, and Lasso Regression. The Random Forest model outperformed the others, achieving a Mean Squared Error (MSE) of 30.47 and an R² of 94.36%, which indicates that it can capture complex, non-linear relationships in geotechnical data. The tuned Gradient Boosting model followed with an MSE of 98.57 and an R² of 81.57%. Simpler models such as Linear Regression (MSE: 251.11, R²: 52.39%) and Lasso Regression (MSE: 251.11, R²: 52.40%) were less effective, while Naive Bayes performed poorly (MSE: 577.70, R²: 27.62%). The results highlight the advantage of ensemble methods, particularly Random Forest, for accurate SPT-N prediction, offering a cost-effective alternative to extensive SPT testing. Feature importance analyses identified depth and cone tip resistance as critical predictors. These findings suggest that ML-driven approaches can support rapid, reliable subsurface characterization, with potential for further improvement through additional variables and broader datasets.
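The two metrics quoted above, MSE and R², can be computed as in the minimal sketch below. It uses only the Python standard library. The function names `mse` and `r2_score` and the sample SPT-N values are illustrative assumptions, not the study's code or data. R² is returned as a fraction and multiplied by 100 to match the percentage form used in the abstract.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted SPT-N values (assumed for illustration;
# these are NOT records from the I-405 dataset).
spt_n_measured = [10.0, 20.0, 30.0, 40.0]
spt_n_predicted = [12.0, 18.0, 33.0, 37.0]

print(f"MSE: {mse(spt_n_measured, spt_n_predicted):.2f}")
print(f"R²: {100 * r2_score(spt_n_measured, spt_n_predicted):.2f}%")
```

When both metrics are evaluated on the same test set, they are linked by R² = 1 − MSE / Var(y), where Var(y) is the population variance of the measured values. A lower MSE therefore corresponds directly to a higher R².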