Keywords
Representation
Scalability
Training set
Encoder
Molecular graph
Multitask learning
Graph
Computer science
Artificial intelligence
Machine learning
Task
Theoretical computer science
Authors
Jungwoo Kim, Woojae Chang, Hyunjun Ji, InSuk Joung
Identifier
DOI:10.1021/acs.jcim.4c00772
Abstract
We examined pretraining tasks leveraging abundant labeled data to effectively enhance molecular representation learning in downstream tasks, specifically emphasizing graph transformers to improve the prediction of ADMET properties. Our investigation revealed limitations in previous pretraining tasks and identified more meaningful training targets, ranging from 2D molecular descriptors to extensive quantum chemistry simulations. These data were seamlessly integrated into supervised pretraining tasks. The implementation of our pretraining strategy and multitask learning outperforms conventional methods, achieving state-of-the-art outcomes in 7 out of 22 ADMET tasks within the Therapeutics Data Commons by utilizing a shared encoder across all tasks. Our approach underscores the effectiveness of learning molecular representations and highlights the potential for scalability when leveraging extensive data sets, marking a significant advancement in this domain.
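To make the shared-encoder multitask pattern described in the abstract concrete, below is a minimal PyTorch sketch. It is not the authors' graph transformer: the encoder here is a generic stand-in (a small transformer over padded per-atom feature vectors), and the task names (descriptors_2d, qm_energy, admet_logS), dimensions, and data are hypothetical placeholders. The point is only the structure the abstract describes: one encoder shared by all supervised pretraining targets and downstream ADMET heads.

```python
# Sketch of supervised multitask learning with a single shared encoder.
# All dimensions, task names, and tensors are hypothetical placeholders;
# the real paper uses a graph transformer and curated pretraining labels.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Stand-in molecular encoder: transformer over padded atom features."""
    def __init__(self, atom_dim=16, d_model=64, nhead=4, nlayers=2):
        super().__init__()
        self.proj = nn.Linear(atom_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)

    def forward(self, atom_feats, pad_mask):
        # atom_feats: (B, N, atom_dim); pad_mask: (B, N), True where padded.
        h = self.encoder(self.proj(atom_feats), src_key_padding_mask=pad_mask)
        h = h.masked_fill(pad_mask.unsqueeze(-1), 0.0)
        # Mean-pool over real (non-padded) atoms -> one embedding per molecule.
        return h.sum(1) / (~pad_mask).sum(1, keepdim=True).clamp(min=1)

class MultitaskModel(nn.Module):
    """One shared encoder, one lightweight prediction head per task."""
    def __init__(self, task_dims, d_model=64):
        super().__init__()
        self.encoder = SharedEncoder(d_model=d_model)
        self.heads = nn.ModuleDict(
            {name: nn.Linear(d_model, dim) for name, dim in task_dims.items()})

    def forward(self, atom_feats, pad_mask, task):
        return self.heads[task](self.encoder(atom_feats, pad_mask))

# Hypothetical tasks: pretraining targets (2D descriptors, a QM label) and a
# downstream ADMET endpoint all share the same encoder weights.
tasks = {"descriptors_2d": 200, "qm_energy": 1, "admet_logS": 1}
model = MultitaskModel(tasks)

# Toy batch: 8 molecules, up to 30 atoms, 16 features per atom.
x = torch.randn(8, 30, 16)
mask = torch.zeros(8, 30, dtype=torch.bool)
mask[:, 20:] = True  # pretend the last 10 atom slots are padding

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for task, dim in tasks.items():
    target = torch.randn(8, dim)  # random stand-in labels
    loss = nn.functional.mse_loss(model(x, mask, task), target)
    opt.zero_grad(); loss.backward(); opt.step()
    print(task, float(loss))
```

Because every head backpropagates through the same encoder, gradients from abundant pretraining labels shape the molecular representation that the data-scarce ADMET heads then reuse, which is the transfer effect the abstract attributes to its pretraining-plus-multitask strategy.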