Computer-aided molecular design based on the evaluation of protein-ligand (PL) binding affinity is important for accelerating drug discovery. Deep-learning models for natural language processing (NLP) built on the attention mechanism (i.e., Transformers) are expected to improve PL-affinity evaluation while also identifying the amino acid residues and ligand atoms responsible for molecular interactions. However, the original attention (i.e., scaled dot-product with row-wise Softmax normalization) was designed for relationships between homogeneous inputs (e.g., two sentences in a translation task) and is therefore ill-suited to the heterogeneity of inputs such as a protein (~10^2-10^3 residues) and a ligand (~10^1-10^2 atoms). In fact, because the number of protein residues differs greatly from the number of ligand atoms, the attention weights end up on very different scales after normalization, which degrades performance. Herein, we report PLTransformer, which incorporates a novel attention mechanism that applies Softmax normalization along both the protein-residue and ligand-atom directions in a more balanced manner, correcting the imbalance through a tandem of attention blocks with swapped inputs. As a result, PLTransformer not only performs better on affinity prediction but also identifies binding pockets in proteins, including allosteric sites, as well as important scaffolds and substructures of ligands, without using three-dimensional (3D) structures of PL complexes (i.e., a ligand-based drug design (LBDD) framework). Moreover, the candidate PL (residue-ligand atom) contacts provided by PLTransformer help select among PL binding modes obtained from docking simulations, enabling more reliable identification of the appropriate binding mode as the one that maximizes the PLTransformer score (i.e., a structure-based drug design (SBDD) framework).
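The abstract does not give the exact formulation of the tandem attention, so the following is only a minimal PyTorch sketch of the stated idea: standard scaled dot-product cross-attention normalizes its score matrix with Softmax over one axis only, while a tandem of two cross-attention blocks with swapped protein/ligand inputs lets each Softmax run along one of the two directions. All class and parameter names (CrossAttention, TandemCrossAttention, d_model) are illustrative assumptions, not taken from the PLTransformer source.

```python
# Hypothetical sketch of tandem protein-ligand cross-attention; the names
# and layer layout are assumptions, not the authors' actual architecture.
import torch
import torch.nn as nn


class CrossAttention(nn.Module):
    """Scaled dot-product cross-attention: queries attend to keys/values.

    Softmax is taken over the key axis only (row-wise normalization).
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x_q: torch.Tensor, x_kv: torch.Tensor) -> torch.Tensor:
        # x_q: (B, Nq, d), x_kv: (B, Nk, d); Nq and Nk can differ greatly,
        # e.g., ~10^2-10^3 residues vs. ~10^1-10^2 ligand atoms.
        scores = torch.einsum("bqd,bkd->bqk", self.q(x_q), self.k(x_kv)) * self.scale
        attn = scores.softmax(dim=-1)  # normalization along the key axis only
        return torch.einsum("bqk,bkd->bqd", attn, self.v(x_kv))


class TandemCrossAttention(nn.Module):
    """Tandem of cross-attention blocks with swapped inputs, so that Softmax
    normalization acts along both the residue and the atom directions."""

    def __init__(self, d_model: int):
        super().__init__()
        self.prot_to_lig = CrossAttention(d_model)  # residues query atoms
        self.lig_to_prot = CrossAttention(d_model)  # atoms query residues

    def forward(self, protein: torch.Tensor, ligand: torch.Tensor):
        # protein: (B, n_res, d), ligand: (B, n_atom, d)
        protein_out = self.prot_to_lig(protein, ligand)
        ligand_out = self.lig_to_prot(ligand, protein)
        return protein_out, ligand_out


if __name__ == "__main__":
    model = TandemCrossAttention(d_model=64)
    protein = torch.randn(1, 300, 64)  # protein residue embeddings
    ligand = torch.randn(1, 30, 64)    # ligand atom embeddings
    p, l = model(protein, ligand)
    print(p.shape, l.shape)  # (1, 300, 64) and (1, 30, 64)
```

Under this reading, each input set is normalized as the query side of one block, so neither the residue nor the atom direction is left with only a single, scale-imbalanced row-wise Softmax.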