Chemistry
Identification (biology)
Computer science
Sequence (biology)
Graph
Drug discovery
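Artificial intelligence
Protein-ligand
Data mining
Computational biology
Binding affinity
Machine learning
Virtual screening
Plasma protein binding
Protein sequencing
Binding site
Protein-protein interaction
Ligand (biochemistry)
Molecular binding
Drug target
Synthetic data
Binding pocket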
Authors
Yi He, Minghao Liu, Hao Wang, Lu Han, Weiwei Han
Identifier
DOI:10.1021/acs.jmedchem.5c03431
Abstract
Accurate prediction of drug-target binding affinity (DTA) remains a central challenge in drug discovery due to the need to integrate heterogeneous sequence, structural, and physicochemical information. Here, we propose PLMCA, a multimodal protein-ligand cross-attention framework that unifies protein sequence embeddings from two protein language models, three-dimensional geometric features, physicochemical descriptors, and ligand molecular graph representations within a single architecture. PLMCA further incorporates experimental assay conditions from the ChEMBL database as auxiliary inputs to mitigate batch effects and reduce measurement noise. On the PDBbind21 data set, PLMCA performs competitively with or outperforms state-of-the-art methods under random, unseen-ligand, and unseen-protein splits for Kd and Ki prediction. On the ChEMBL_mini data set, PLMCA achieves R² values of 0.531, 0.635, and 0.519 for IC50, Kd, and Ki prediction, respectively. In addition, PLMCA demonstrates robust protein binding pocket prediction performance, achieving an AUPR of up to 0.655 under the unseen-protein setting.
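The abstract names cross-attention between protein and ligand representations but does not give the details. As a rough illustration only, the following is a minimal sketch of generic scaled dot-product cross-attention in plain Python. It is not the PLMCA implementation: the function names, the toy two-dimensional feature vectors, the choice of ligand atoms as queries and protein residues as keys and values, and the omission of learned projection matrices are all assumptions made for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (no learned projections).

    queries: one vector per ligand atom (hypothetical ligand features)
    keys, values: one vector per protein residue (hypothetical protein features)
    Returns (outputs, weights): one attended vector and one weight row per query.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy inputs, invented for illustration: 2 ligand atoms, 3 protein residues.
ligand_atoms = [[1.0, 0.0], [0.0, 1.0]]
protein_residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = cross_attention(ligand_atoms, protein_residues, protein_residues)
```

Each row of `attn` sums to 1, so every ligand atom receives a convex combination of the residue vectors. A trained model would normally apply learned query, key, and value projections and several attention heads before this step.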