Abstract

The digitization of histological slides into high-resolution Whole Slide Images (WSIs) has revolutionized pathology workflows, providing unparalleled insights into tissue morphology and playing a vital role in advancing cancer diagnostics and treatment. Tasks that were once time-consuming and subjective can now be automated and enhanced with machine learning (ML) and artificial intelligence (AI) solutions. These include image analysis systems capable of extracting diagnostic insights such as tumor classification, subclassification, and grading, aiming to improve accuracy, reproducibility, and efficiency in histopathology workflows. To lower the technical barrier for researchers who want to access and use advanced ML, we developed a comprehensive solution for classifying WSIs by disease or tumor subtype, based on the morphological characteristics of the tissue. Building on the NCI Cancer Research Data Commons’ (CRDC) cloud infrastructure, we take advantage of hosted data from the Imaging Data Commons (IDC) and Human Tumor Atlas Network (HTAN), paired with the computational resources available in the Cancer Genomics Cloud (CGC), powered by Velsera and funded by the NCI, to host a reproducible WSI ML solution. This solution includes data preparation and preprocessing, model training, evaluation, and prediction within an interactive analysis environment. The analysis harnesses the computational capabilities of the CGC platform to efficiently process large-scale datasets. To facilitate data preparation, we integrated tools previously developed on the CGC to extract regions of interest (ROIs) from WSIs, enabling the creation of expanded datasets by generating tumor tissue patches for model training. To further enhance classification performance, we employed a consensus model approach, combining predictions from multiple models to improve robustness and accuracy.
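The abstract does not state how the predictions from multiple models are combined. As a purely illustrative sketch, one common consensus scheme is soft voting: class probabilities from each model are averaged per patch, and the patch-level consensus is then averaged into a slide-level prediction. The function names, the class labels, and the choice of averaging are all assumptions made for this example, not the authors' implementation.

```python
def soft_vote(model_probs):
    """Average per-class probabilities from several models for one patch.

    model_probs: list of probability vectors, one per model (hypothetical input).
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def classify_slide(patch_probs_by_model, class_names):
    """Combine patch-level predictions from multiple models into one slide label.

    patch_probs_by_model[m][i] holds the class probabilities that model m
    assigns to patch i. Averaging is an assumed aggregation rule.
    """
    n_patches = len(patch_probs_by_model[0])
    n_classes = len(class_names)

    # Consensus per patch: average the models' probability vectors.
    consensus = [
        soft_vote([model[i] for model in patch_probs_by_model])
        for i in range(n_patches)
    ]

    # Slide-level score: mean of the consensus probabilities over all patches.
    slide_probs = [
        sum(p[c] for p in consensus) / n_patches for c in range(n_classes)
    ]
    best = max(range(n_classes), key=slide_probs.__getitem__)
    return class_names[best], slide_probs


# Two hypothetical models scoring two patches from one slide.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.7, 0.3], [0.2, 0.8]]
label, probs = classify_slide([model_a, model_b], ["subtype_A", "subtype_B"])
print(label, probs)
```

Soft voting keeps each model's confidence in play, so a single uncertain model is less likely to flip the slide-level call than under a hard majority vote; other aggregation rules would be equally consistent with the abstract's description.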
This work demonstrates the power of coupling AI-ready harmonized imaging data, available from within the CRDC, with advanced ML techniques, and showcases an end-to-end ML workflow on the CGC. By hosting the solution on an accessible and interactive system, users will be able to seamlessly apply the analysis to their own datasets, gaining meaningful insights with minimal technical barriers. The combination of best practices in data preparation, model development, and consensus-driven accuracy improvement makes ML for histopathology more accessible to users, enabling further research and deeper insights into cancer etiology and diagnostics.

Citation Format: Jovana Babić, Nevena Nikolić, Milan Kovačević, Rowan Beck, Tariq Khoyratty, Zelia Worman. Cloud-based machine learning for enhanced tumor classification in cancer genomics: an end-to-end solution for whole slide imaging data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2025; Part 1 (Regular Abstracts); 2025 Apr 25-30; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2025;85(8_Suppl_1):Abstract nr 7434.
Published in: Cancer Research
Volume 85, Issue 8_Supplement_1, pp. 7434-7434