ABSTRACT

Detecting causal language in scientific literature is critical for understanding how research fields frame evidence and inform interventions and policies, yet existing approaches commonly rely on manual annotation. The objective of this study was to evaluate four classifiers for detecting causal language and to apply the best-performing model to assess trends in microbiome research. Microbiome research, with its rapidly expanding observational literature, provides a relevant case study. We extracted Term Frequency–Inverse Document Frequency (TF-IDF) features from the last three sentences of available publication abstracts and trained four classifiers (L1- and L2-regularized logistic regression, Random Forest, and eXtreme Gradient Boosting) to detect causal language. A total of 475 sentences, a sample size determined pragmatically by annotation feasibility and the observed stabilization of model performance, were manually labeled as causal or non-causal following established guidelines for the systematic evaluation of causal language in observational health research. Of these, 75% were used for training and 25% for testing. L1-regularized logistic regression achieved the highest performance (accuracy 76%, F1 72%, prevalence detection accuracy 95%, sensitivity 72%, specificity 80%) and was applied to 20,022 human gut microbiome abstracts published between 2015 and 2025, grouped into 20 thematic topics using structural topic modeling. The predicted prevalence of causal language declined from 52% to 44% between 2015 and 2018, then rose to 51% by 2025, with notable variation across topics (range: 43.1–53.3%). Temporal trends differed across subfields, with increases in Metabolic disorders and Fecal microbiota transplantation, and decreases in Biomarkers and prediction, Antibiotic resistance, and In vitro fermentation. Analysis of influential words confirmed that causal meaning is primarily driven by verbs and modifiers lexically signaling change or intervention.
The proposed approach for identifying causal claims in scientific abstracts enables systematic, automated, and scalable assessment of how evidence is framed. Its application to the microbiome field highlighted heterogeneity in the reporting of causal relationships, informing the interpretation of microbiome findings for clinical and public health decision-making.