THE INTRINSIC COMPLEXITY OF ADMET PREDICTION

2026 · 0 citations · Journal Article · Green Open Access

Authors

Xichen Zhang · Quanta Technology (United States)

Xin Chen · Quanta Technology (United States)

Abstract

Different ADMET endpoints involve distinct mechanisms and may therefore require different modeling strategies. We benchmarked four architectures that combine two molecular representations (Morgan fingerprints and Chemprop v2 graph networks), each with and without physicochemical descriptors added via late fusion, on LogD 7.4 regression, BBB permeability classification, and CYP2D6 classification. For LogD 7.4, GNN-based models consistently outperformed fingerprint-based networks on both datasets, while appending physicochemical descriptors yielded only marginal improvements, suggesting that graph structure alone captures most of the predictive signal for lipophilicity. For BBB classification, all architectures achieved strong performance, with GNN variants offering the best balance between false-positive and false-negative rates. CYP2D6 remained challenging under class imbalance: GNNs reduced false-positive rates but at the cost of substantially higher false-negative rates, indicating conservative predictions with limited recall of positives. Cross-dataset evaluation between AstraZeneca and OpenADMET revealed asymmetric transferability, highlighting distribution shift across chemical spaces. These findings suggest that lightweight models suffice for well-separated endpoints such as BBB, whereas target-dependent tasks like CYP2D6 likely require features beyond molecular structure alone.
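
The trade-off between false-positive and false-negative rates described for CYP2D6 can be illustrated with a minimal sketch. The function name `error_rates` and the toy label vectors below are hypothetical and are not taken from the paper. They only show how a conservative classifier on an imbalanced set can reach a low false-positive rate while missing most positives.

```python
from typing import Sequence, Tuple


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for binary 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    # Guard against empty classes so the ratios are always defined.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical imbalanced set: 2 positives among 10 compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A conservative model that never predicts the positive class.
conservative = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# A more liberal model that recovers both positives but flags 2 negatives.
liberal = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(error_rates(y_true, conservative))  # (0.0, 1.0)
print(error_rates(y_true, liberal))       # (0.25, 0.0)
```

In this toy case the conservative predictor has a false-positive rate of 0 but a false-negative rate of 1, which is why reporting both rates matters under class imbalance.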

Topics & Keywords

Computational Drug Discovery Methods · Machine Learning in Bioinformatics · Machine Learning in Materials Science

Publication Details

Published in: ChemRxiv

DOI: 10.26434/chemrxiv.15001004/v1

Field-Weighted Citation Impact: 0.00
