Machine learning (ML) is a powerful tool for inferring chemometric properties from Raman spectra, expanding the information extractable from high-dimensional spectral data. A growing application is the estimation of soil organic carbon (SOC), where ML models relate overlapping Raman and fluorescence features to chemical composition. However, these models typically lack calibrated, prediction-level uncertainty estimates, which limits their utility in decision-critical contexts. We present a framework for quantifying predictive uncertainty in SOC estimation from Raman spectra acquired with Shifted Excitation Raman Difference Spectroscopy (SERDS). The approach employs conformal prediction (CP) to generate statistically valid prediction intervals using a held-out calibration data set and is compatible with a variety of uncertainty quantification (UQ) methods. To our knowledge, this is the first unified framework that integrates conformal calibration with multiple UQ strategies for Raman-based SOC estimation, addressing both aleatoric (irreducible) and epistemic (reducible) sources of uncertainty in a field-relevant setting. We assess the framework across several regression models, including Deep Ensembles, Bayesian neural networks, Monte Carlo Dropout, quantile regression, and heteroscedastic Gaussian models. All methods, when conformalized, produced well-calibrated uncertainty estimates with narrow prediction intervals, achieving reliable empirical coverage across confidence levels. Ablation studies revealed that many UQ techniques were poorly calibrated without conformalization. Our findings indicate that uncertainty in this task is predominantly aleatoric, suggesting that improvements in predictive performance will depend more on improving spectral quality and preprocessing than on model complexity.
This framework provides a practical, generalizable solution for generating trustworthy, calibrated, sample-specific uncertainty estimates in Raman-based chemometric analyses.
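To make the conformal calibration step concrete, the following is a minimal sketch of split conformal prediction wrapped around a generic UQ model. It is an illustration under stated assumptions, not the authors' implementation: the function names `conformal_quantile` and `conformal_interval` are hypothetical, the normalized-residual score |y - mu| / sigma is one common choice that may differ from the score used in the paper, and the data are synthetic rather than SERDS spectra or SOC measurements.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores.

    Returns the k-th smallest score with k = ceil((n + 1) * (1 - alpha)),
    or infinity when the calibration set is too small for that level.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    return np.inf if k > n else scores[k - 1]

def conformal_interval(mu_cal, sigma_cal, y_cal, mu_test, sigma_test, alpha=0.1):
    """Rescale a model's own uncertainty estimate so intervals reach 1 - alpha coverage."""
    # Nonconformity score: absolute residual in units of the predicted spread
    # (assumed score; other UQ methods would plug in their own spread estimate).
    scores = np.abs(y_cal - mu_cal) / sigma_cal
    q = conformal_quantile(scores, alpha)
    return mu_test - q * sigma_test, mu_test + q * sigma_test

# Synthetic stand-in for a regression model with heteroscedastic noise.
rng = np.random.default_rng(0)

def simulate(n):
    x = rng.uniform(0.0, 1.0, n)
    noise_sd = 0.2 + 0.8 * x               # input-dependent (aleatoric) noise
    y = 2.0 * x + rng.normal(0.0, noise_sd)
    mu = 2.0 * x + 0.1                     # slightly biased point prediction
    sigma = 0.5 * noise_sd                 # overconfident spread estimate
    return mu, sigma, y

mu_cal, sigma_cal, y_cal = simulate(1000)     # held-out calibration set
mu_test, sigma_test, y_test = simulate(5000)  # test set

alpha = 0.1
lo, hi = conformal_interval(mu_cal, sigma_cal, y_cal, mu_test, sigma_test, alpha)
coverage = np.mean((y_test >= lo) & (y_test <= hi))

# Uncalibrated 90% Gaussian interval built directly from the model's sigma.
raw_coverage = np.mean(np.abs(y_test - mu_test) <= 1.645 * sigma_test)

print(f"raw coverage: {raw_coverage:.3f}, conformal coverage: {coverage:.3f}")
```

In this toy setting the raw Gaussian interval undercovers because the spread estimate is too small, while the conformalized interval is rescaled by a single factor learned from the calibration set. The coverage guarantee is marginal and assumes calibration and test samples are exchangeable.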