Although protein-RNA interactions are crucial for many biological processes, predicting their binding free energies (Δ<i>G</i>) remains challenging because experimental data are limited and the interactions are complex. To address this issue, we developed the PANTHER Score, a machine learning (ML)-based model that predicts energy-based scores for protein-RNA complexes. Following a local-to-global approach, the methodology comprises five steps: (1) we derived 87,117 pairwise local interaction energies from 331,744 molecular dynamics (MD)-derived interactions across 46 curated protein-RNA complexes; (2) we trained ML models on pairwise interaction features to predict local interaction energies without performing MD simulations; (3) we integrated the predicted local interaction energies to compute a model-specific PANTHER Score; (4) we evaluated each model-specific PANTHER Score on an independent test set of seven complexes; and (5) we validated the models on an external stress set of 110 complexes with experimental Δ<i>G</i> values and selected the optimal model for implementation in the PANTHER Scoring pipeline. Among the regression models developed, Random Forest Regression exhibited the highest predictive performance as a model-specific PANTHER Score, achieving a Pearson correlation (<i>r</i>) of 0.80 and a mean absolute error (MAE) of 1.79 kcal/mol on the test set. It maintained strong predictive performance on the stress set (<i>r</i> = 0.64, MAE = 1.63 kcal/mol). In benchmarking against existing tools on the stress set, the PANTHER Score demonstrated higher accuracy and reliability. This study shows how combining MD with machine learning can compensate for limited experimental data, positioning the PANTHER Score as a robust tool for predicting protein-RNA binding affinities in biomolecular research and drug discovery, particularly for RNA therapeutics.
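The local-to-global step and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how local energies are combined, so plain summation per complex is an assumption here, and the complex identifiers, energies, and Δ<i>G</i> values are made-up toy numbers rather than data from the study. The helper names `aggregate_local_energies`, `pearson_r`, and `mean_absolute_error` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def aggregate_local_energies(local_predictions):
    """Combine predicted pairwise energies into one score per complex.

    ASSUMPTION: simple summation; the abstract does not specify the
    aggregation rule used in the local-to-global step.
    """
    totals = defaultdict(float)
    for complex_id, energy in local_predictions:
        totals[complex_id] += energy
    return dict(totals)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_absolute_error(xs, ys):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Toy data (illustrative only): (complex id, predicted local energy in kcal/mol).
local_predictions = [
    ("complex_A", -3.0), ("complex_A", -4.5),
    ("complex_B", -2.0), ("complex_B", -3.5), ("complex_B", -1.5),
    ("complex_C", -5.0), ("complex_C", -6.0),
]
# Toy experimental binding free energies in kcal/mol (illustrative only).
experimental_dg = {"complex_A": -8.0, "complex_B": -7.5, "complex_C": -10.0}

scores = aggregate_local_energies(local_predictions)
ids = sorted(experimental_dg)
predicted = [scores[c] for c in ids]
observed = [experimental_dg[c] for c in ids]

r = pearson_r(predicted, observed)
mae = mean_absolute_error(predicted, observed)
print(f"Pearson r = {r:.2f}, MAE = {mae:.2f} kcal/mol")
```

In practice the per-pair energies would come from a trained regressor, such as the Random Forest model named above, rather than from a hard-coded list; the evaluation logic would stay the same.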