Machine learning
Transferability
Chemical space
Computer science
Robustness (evolution)
Artificial intelligence
Domain knowledge
Graph
Knowledge graph
Multitask learning
Data mining
Chemistry
Theoretical computer science
Drug discovery
Task (project management)
Biochemistry
Management
Reuter
Economics
Gene
Authors
Xixi Yang, Yanjing Duan, Zhixiang Cheng, Kun Li, Yuansheng Liu, Xiangzheng Fu, Dongsheng Cao
Identifier
DOI:10.1021/acs.jmedchem.4c02193
Abstract
Molecular property prediction with deep learning often employs self-supervised learning techniques to learn common knowledge through masked atom prediction. However, the common knowledge gained by masked atom prediction dramatically differs from the graph-level optimization objective of downstream tasks, which leads to suboptimal performance. Particularly for properties with limited data, the failure to consider domain knowledge results in a direct search in an immense common space, rendering it infeasible to identify the global optimum. To address this, we propose MPCD, which enhances pretraining transferability by aligning the optimization objectives between pretraining and fine-tuning with domain knowledge. MPCD also leverages multitask learning to improve data utilization and model robustness. Technically, MPCD employs a relation-aware self-attention mechanism to capture molecules' local and global structures comprehensively. Extensive validation demonstrates that MPCD outperforms state-of-the-art methods for absorption, distribution, metabolism, excretion, and toxicity (ADMET) and physicochemical prediction across various data sizes.
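For readers unfamiliar with the "relation-aware self-attention" mentioned in the abstract, the following is a minimal sketch of the general idea, not the authors' MPCD implementation: pairwise relation IDs between atoms (e.g., bond types or shortest-path distance buckets, both assumed inputs here) contribute a learned bias to the attention logits, so that both local bonding and global graph structure shape the attention pattern. All names and shapes below are illustrative assumptions.

```python
# Hypothetical sketch of relation-aware self-attention for molecular graphs.
# NOT the MPCD code. Assumes atom embeddings of shape (batch, n_atoms, d_model)
# and integer pairwise relation IDs of shape (batch, n_atoms, n_atoms).
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelationAwareSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_relations: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One scalar bias per (relation ID, head), added to the attention
        # logits so pairwise structure influences attention weights.
        self.relation_bias = nn.Embedding(n_relations, n_heads)

    def forward(self, x: torch.Tensor, relation_ids: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, n_atoms, d_head).
        q = q.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        # Embedding output is (batch, n, n, heads); move heads to dim 1.
        bias = self.relation_bias(relation_ids).permute(0, 3, 1, 2)
        attn = F.softmax(scores + bias, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.out(out)


if __name__ == "__main__":
    layer = RelationAwareSelfAttention(d_model=64, n_heads=4, n_relations=8)
    atoms = torch.randn(2, 10, 64)            # hypothetical atom embeddings
    rels = torch.randint(0, 8, (2, 10, 10))   # hypothetical pairwise relation IDs
    print(layer(atoms, rels).shape)           # torch.Size([2, 10, 64])
```

Under these assumptions, the bias term is what lets a single attention layer attend over both bonded neighbors and distant atoms; the paper's actual mechanism and parameterization may differ.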