Neprilysin (NEP) is a zinc-dependent metallopeptidase and a key therapeutic target in heart failure management. Efficient identification of potent NEP inhibitors remains a challenge in drug discovery. The aim of this study was to develop a quantitative structure–activity relationship (QSAR) model that uses 2D Mordred molecular descriptors and a Random Forest algorithm to predict the inhibitory potency (pIC₅₀) of drug candidates.

A curated dataset of compounds with experimentally determined IC₅₀ values (in nM) against NEP was preprocessed, and the IC₅₀ values were converted to pIC₅₀. Mordred was used to calculate 2D molecular descriptors, and descriptors with missing values were excluded. The dataset was split into training, internal validation, and external test sets. A Random Forest regression model with 500 estimators was trained and evaluated using the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and concordance correlation coefficient (CCC). A binary classification model was also constructed. Feature importance, residual analysis, and chemical space visualization were used to assess model interpretability and reliability.

In internal validation, the regression model showed modest predictive performance (R² = 0.286, RMSE = 0.949, MAE = 0.723, CCC = 0.532). Performance was higher on the external test set (R² = 0.659, RMSE = 0.858, MAE = 0.630, CCC = 0.763). The binary classifier achieved an accuracy of 0.953, precision of 1.000, recall of 0.943, and an F1-score of 0.971, indicating strong discriminative ability in classifying inhibitory versus non-inhibitory compounds. The top contributing descriptors were ATSC2p (feature importance = 0.0505), GATS2p (0.0408), and SaasC (0.0317). Principal component analysis (PCA) and Williams plots indicated that test compounds lie within the model’s applicability domain, with no major outliers in leverage or residual distribution.
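As an illustration of the IC₅₀-to-pIC₅₀ conversion and the four regression metrics reported above, the following is a minimal sketch and not the authors' code. The function names `ic50_nm_to_pic50` and `regression_metrics` are hypothetical. The sketch assumes that R² is the coefficient of determination (one minus the ratio of residual to total sum of squares) and that CCC is Lin's concordance correlation coefficient computed with population variances; the study may have used other variants.

```python
import numpy as np


def ic50_nm_to_pic50(ic50_nm):
    """Convert IC50 values in nM to pIC50.

    pIC50 = -log10(IC50 in mol/L) = 9 - log10(IC50 in nM).
    """
    ic50_nm = np.asarray(ic50_nm, dtype=float)
    if np.any(ic50_nm <= 0):
        raise ValueError("IC50 values must be positive")
    return 9.0 - np.log10(ic50_nm)


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, and Lin's CCC for observed vs. predicted pIC50."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred

    # Coefficient of determination (assumed definition: 1 - SS_res / SS_tot).
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))

    # Lin's CCC with population (ddof=0) variance and covariance.
    cov = np.mean((y_true - y_true.mean()) * (y_pred - y_pred.mean()))
    ccc = 2.0 * cov / (
        y_true.var() + y_pred.var() + (y_true.mean() - y_pred.mean()) ** 2
    )
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "CCC": ccc}


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    observed = ic50_nm_to_pic50([1.0, 10.0, 100.0, 1000.0])
    predicted = observed + np.array([0.2, -0.1, 0.3, -0.2])
    print(regression_metrics(observed, predicted))
```

CCC penalizes both scatter and systematic shift between predicted and observed values, whereas R² as defined here can be low whenever the validation set spans a narrow pIC₅₀ range, so the two metrics need not move together.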
The Random Forest-based QSAR model developed here shows good predictive performance on the external test set and offers interpretable descriptors for identifying NEP inhibitors. This study provides a valuable tool for virtual screening and highlights the relevance of 2D structural features in governing NEP inhibitory activity. To the authors' knowledge, it is the first dedicated QSAR analysis of neprilysin inhibition using Mordred descriptors with rigorous internal and external validation.