Different ADMET endpoints involve distinct mechanisms and may therefore require different modeling strategies. We benchmarked four architectures that combine two molecular representations (Morgan fingerprints and Chemprop v2 graph neural networks, GNNs), each with and without physicochemical descriptors added via late fusion, on LogD 7.4 regression, BBB permeability classification, and CYP2D6 classification. For LogD 7.4, GNN-based models consistently outperformed fingerprint-based networks on both the AstraZeneca and OpenADMET datasets, while appending physicochemical descriptors yielded only marginal improvements, suggesting that graph structure alone captures most of the predictive signal for lipophilicity. For BBB classification, all architectures achieved strong performance, with GNN variants offering the best balance between false positive and false negative rates. CYP2D6 remained challenging under class imbalance: GNNs reduced false positive rates, but at the cost of substantially higher false negative rates, indicating conservative predictions with limited recall of positives. Cross-dataset evaluation between AstraZeneca and OpenADMET revealed asymmetric transferability, highlighting distribution shift across chemical spaces. These findings suggest that lightweight models suffice for well-separated endpoints such as BBB, whereas target-dependent tasks like CYP2D6 likely require features beyond molecular structure alone.
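The trade-off between false positive and false negative rates described for CYP2D6 can be made concrete with a minimal sketch. The function name `error_rates` and the toy labels below are hypothetical and are not taken from the study; they only illustrate how a conservative classifier on an imbalanced set can lower the false positive rate while missing more true positives.

```python
def error_rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty classes so the ratios stay defined.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical imbalanced toy set (illustration only): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A conservative model: no false alarms, but it misses one of the two positives.
conservative = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# A more liberal model: finds both positives, at the cost of two false alarms.
liberal = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(error_rates(y_true, conservative))  # (0.0, 0.5)
print(error_rates(y_true, liberal))       # (0.25, 0.0)
```

With few positives, a single missed compound moves the false negative rate sharply, which is why both rates are worth reporting alongside any aggregate score on imbalanced endpoints.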