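Chemistry
Drug database
Computer science
Artificial intelligence
Feature (linguistics)
Embedding
Machine learning
Pipeline (software)
Drug discovery
Drug repositioning
Deep learning
Cheminformatics
Overfitting
Feature learning
Drug development
Bioinformatics
Artificial neural network
Drug (pharmaceutical)
Biology
Philosophy
Gene
Pharmacology
Programming language
Biochemistry
Linguistics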
Authors
Fachun Wan,Jianyang Zeng
Abstract
Accurately identifying compound-protein interactions in silico can deepen our understanding of the mechanisms of drug action and significantly facilitate the drug discovery and development process. Traditional similarity-based computational models for compound-protein interaction prediction rarely exploit the latent features in currently available large-scale unlabelled compound and protein data, and are often limited to relatively small-scale datasets. We propose a new scheme that combines feature embedding (a representation learning technique) with deep learning for predicting compound-protein interactions. Our method automatically learns low-dimensional, implicit but expressive features for compounds and proteins from massive amounts of unlabelled data. By combining effective feature embedding with powerful deep learning techniques, our method provides a general computational pipeline for accurate compound-protein interaction prediction, even when the interaction knowledge of compounds and proteins is entirely unknown. Evaluations on current large-scale databases of measured compound-protein affinities, such as ChEMBL and BindingDB, as well as known drug-target interactions from DrugBank, have demonstrated the superior prediction performance of our method and suggest that it can offer a useful tool for drug development and drug repositioning.
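The two-stage idea in the abstract (unsupervised feature embedding, then a neural network over compound-protein pairs) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the embedding technique or the network architecture, so truncated SVD of a count matrix stands in for the embedding step, the count matrices are random toy data, and the small network uses untrained random weights. The names `embed` and `mlp_score` are hypothetical.

```python
import numpy as np


def embed(counts: np.ndarray, dim: int) -> np.ndarray:
    """Low-dimensional row embeddings via truncated SVD (an assumed stand-in
    for the unsupervised feature-embedding step)."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :dim] * s[:dim]


def mlp_score(pair, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a sigmoid output in [0, 1]."""
    hidden = np.maximum(0.0, pair @ w1 + b1)
    logit = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logit))


rng = np.random.default_rng(0)

# Toy unlabelled data (assumption): rows are entities, columns are feature counts.
compound_counts = rng.poisson(1.0, size=(6, 20)).astype(float)  # 6 compounds
protein_counts = rng.poisson(1.0, size=(4, 30)).astype(float)   # 4 proteins

# Stage 1: learn embeddings without using any interaction labels.
compound_vecs = embed(compound_counts, dim=3)
protein_vecs = embed(protein_counts, dim=3)

# Stage 2: score one compound-protein pair with a small network.
# The weights are random and untrained, so the score carries no meaning.
pair = np.concatenate([compound_vecs[0], protein_vecs[1]])
w1 = rng.normal(scale=0.1, size=(6, 8))
b1 = np.zeros(8)
w2 = rng.normal(scale=0.1, size=8)
b2 = 0.0
score = mlp_score(pair, w1, b1, w2, b2)
```

In a real pipeline the second stage would be trained on labelled interactions, for example affinities of the kind stored in ChEMBL or BindingDB, while the first stage would still rely only on unlabelled data.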