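Benchmark (surveying)
Artificial intelligence
Convolutional neural network
Notation
Representation (politics)
Visualization
Computer science
UniProt
Pattern recognition (psychology)
Machine learning
Mathematics
Biology
Politics
Geography
Law
Gene
Geodesy
Arithmetic
Biochemistry
Political science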
Authors
Shuangjia Zheng,Yongjian Li,Sheng Chen,Jun Xu,Yuedong Yang
Identifier
DOI:10.1038/s42256-020-0152-y
Abstract
Identifying novel drug–protein interactions is crucial for drug discovery. For this purpose, many machine learning-based methods have been developed based on drug descriptors and one-dimensional protein sequences. However, protein sequences cannot accurately reflect the interactions in three-dimensional space. On the other hand, direct input of the three-dimensional structure is inefficient because the three-dimensional matrix is sparse, and it is further limited by the small number of co-crystal structures available for training. Here we propose an end-to-end deep learning framework to predict the interactions by representing proteins with a two-dimensional distance map from monomer structures (Image) and drugs with molecular linear notation (String), following the visual question answering mode. For efficient training of the system, we introduce a dynamic attentive convolutional neural network to learn fixed-size representations from the variable-length distance maps and a self-attentional sequential model to automatically extract semantic features from the linear notations. Extensive experiments demonstrate that our model obtains competitive performance against state-of-the-art baselines on the Directory of Useful Decoys, Enhanced (DUD-E), human and BindingDB benchmark datasets. Further attention visualization provides biological interpretation by highlighting the relevant regions of both protein and drug molecules.
When predicting the interaction of proteins with potential drugs, the protein can be encoded as its one-dimensional sequence or as a three-dimensional structure. The latter can capture more relevant features of the protein, but it also makes predicting the interactions harder. A new method predicts these interactions using a two-dimensional distance matrix representation of a protein, which can be processed like a two-dimensional image, striking a balance between data that is simple to process and data that is rich in relevant structure.
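The two-dimensional distance map mentioned in the abstract can be illustrated with a minimal sketch. It assumes each residue is reduced to a single 3D coordinate (for example an alpha-carbon position in angstroms); the abstract does not state which atoms, units, or preprocessing the authors use, so the function name `distance_map` and the toy coordinates below are hypothetical and this is not the paper's implementation.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one per residue (an assumption;
    the source does not specify the atom selection).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # The map is symmetric, so fill both triangles at once.
            dmap[i][j] = d
            dmap[j][i] = d
    return dmap


# Hypothetical coordinates for a three-residue toy "protein".
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
toy_map = distance_map(toy_coords)

for row in toy_map:
    print(" ".join(f"{value:6.2f}" for value in row))
```

A protein with N residues yields an N by N map, which is why the abstract speaks of variable-length distance maps that must be reduced to a fixed-size representation before further processing.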