Rapid and non-invasive analysis of food products is essential in the agrifood sector for ensuring quality, safety and authenticity. In this context, Volatile Organic Compound (VOC) analysis plays a key role, and direct injection mass spectrometry, Proton Transfer Reaction Time-of-Flight Mass Spectrometry (PTR-ToF-MS) in particular, is well suited to the task thanks to its speed and high sensitivity. The resulting datasets are typically modeled using classification, regression, and peak selection methods. In these tasks, gradient boosting methods, and XGBoost in particular, have demonstrated outstanding performance, often surpassing both classical machine learning techniques and deep learning approaches. In this work, we investigate in detail the applicability of XGBoost to PTR-ToF-MS datasets of food VOCs. We show that XGBoost requires careful (and time-consuming) optimization to achieve competitive results in this specific domain. Our results indicate that, on food products, XGBoost performs better in classification than in the other analysis tasks, and that in regression and peak selection it is comparable to other state-of-the-art methods when all methods are appropriately tuned. Given the inherent difficulty of modeling small and noisy real-world datasets, our work highlights the importance of carefully evaluating methods within each specific domain, rather than taking their performance in other domains as a given.

• We evaluate the use of gradient boosting, in particular XGBoost, on PTR-ToF-MS datasets from the agrifood domain.
• We include classification, regression and peak selection problems in the evaluation.
• We analyze the tuning procedure for XGBoost.
• We compare its performance with that of methods often used for our data.
• XGBoost shows good performance for classification problems.
• For regression and peak selection, the results are similar to those of other methods.

Published in: Chemometrics and Intelligent Laboratory Systems
Volume 273, article 105702