Abstract
Introduction The complete blood count with differential (CBD) is one of the most commonly performed blood tests worldwide, used in nearly all areas of medicine. Although modern CBD analyzers generate flow-cytometry-based single-cell measurements, the resulting CBD markers are limited to coarse summary features, such as total cell counts and average cell sizes. As a result, these markers cannot detect subtle cell population shifts that may signal early-stage pathogenesis. We therefore evaluate whether AI-based analysis of the raw single-cell data underlying the CBD can be used to develop novel, clinically prognostic biomarkers across patient settings.
Method We developed two complementary methods for biomarker discovery using CBD tests and evaluated them with longitudinal data from an academic medical center. To create interpretable biomarkers, we clustered cells into physiologically meaningful sub-populations and performed robust statistical summarization. In tandem, we developed self-supervised autoencoders to extract novel non-linear markers. We evaluated the utility of these clustering (CLS) and autoencoder (AE) markers for patient prognostication across a range of outcomes (mortality, inpatient admission, and future disease development).
Results Our study included 242,623 CBD samples from 127,545 patients. Both the clustering and autoencoder approaches generated hundreds of new clinical biomarkers. Many biomarkers showed strong prognostic associations with all-cause mortality, inpatient admission, and development of anemia, cancer, or cardiovascular disease, and these associations remained significant after adjustment for demographics and clinical CBD markers. A large subset of these prognostic markers also showed high novelty: they had low correlations with existing CBD markers while exhibiting significant correlations with broader physiologic signals, such as inflammatory, hormonal, infectious, and coagulopathic markers.
Conclusion Collectively, these results demonstrate how modern AI techniques can enable deeper phenotyping of routine clinical blood counts, generating novel biomarkers that capture more subtle physiologic signals than those currently used in clinical practice.
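The clustering-and-summarization idea in the Method section can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the summary statistics, so everything here is an assumption made for illustration: plain k-means stands in for the clustering step, median and median absolute deviation (MAD) stand in for "robust statistical summarization", the two features are hypothetical per-cell measurements, and the data are synthetic.

```python
import random
import statistics


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(statistics.fmean(dim) for dim in zip(*members))
    return labels


def robust_summary(points, labels, k):
    """Per-cluster cell fraction, plus median and MAD of each feature."""
    summary = {}
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        feats = {"fraction": len(members) / len(points)}
        for j, dim in enumerate(zip(*members)):
            med = statistics.median(dim)
            feats[f"f{j}_median"] = med
            feats[f"f{j}_mad"] = statistics.median([abs(x - med) for x in dim])
        summary[c] = feats
    return summary


# Synthetic "cells": two well-separated populations with two hypothetical features.
rng = random.Random(42)
cells = [(rng.gauss(2.0, 0.3), rng.gauss(1.0, 0.3)) for _ in range(300)]
cells += [(rng.gauss(8.0, 0.3), rng.gauss(6.0, 0.3)) for _ in range(100)]

labels = kmeans(cells, k=2)
markers = robust_summary(cells, labels, k=2)
for cluster_id, feats in markers.items():
    print(cluster_id, {name: round(value, 3) for name, value in feats.items()})
```

Each cluster yields a small set of candidate markers (its share of all cells, and the center and spread of each feature). Median and MAD are used in this sketch because they are less sensitive to outlier cells than mean and standard deviation; the study's actual choices may differ.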