GCLmf: A Novel Molecular Graph Contrastive Learning Framework Based on Hard Negatives and Application in Toxicity Prediction

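Computer science, Graph, Machine learning, Artificial intelligence, Training set, Predictive power, Set (abstract data type), Proxy (statistics), Data mining, Theoretical computer science, Philosophy, Epistemology, Programming language
Authors
Xinxin Yu,Yuanting Chen,Long Chen,Weihua Li,Yuhao Wang,Yun Tang,Guixia Liu
Source
Journal: Molecular Informatics [Wiley]
Identifier
DOI:10.1002/minf.202400169
Abstract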

In silico methods for predicting chemical toxicity can reduce cost and improve efficiency in the early stages of drug discovery. However, because sufficient and reliable toxicity data are scarce, constructing robust and accurate prediction models is challenging. Contrastive learning, a type of self-supervised learning, leverages large amounts of unlabeled data to obtain more expressive molecular representations, which can boost prediction performance on downstream tasks. Although molecular graph contrastive learning has attracted growing attention, current models neglect the quality of the negative set. Here, we proposed a self-supervised pretraining deep learning framework named GCLmf. We first utilized molecular fragments that meet specific conditions as hard negative samples to improve the quality of the negative set, thereby increasing the difficulty of the proxy tasks during pre-training so that the model learns informative representations. GCLmf has shown excellent predictive power on various molecular property benchmarks and demonstrates high performance on 33 toxicity tasks in comparison with multiple baselines. In addition, we investigated the necessity of introducing hard negatives in model building and the impact of the proportion of hard negatives on the model.
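
The effect of hard negatives on a contrastive objective can be illustrated with a small sketch. The abstract does not state the exact loss used by GCLmf, so the code below assumes a standard InfoNCE-style loss with cosine similarity; the function names, the temperature, and the toy 2-D embeddings are all hypothetical and are not taken from the paper. It only shows that adding a negative that lies close to the anchor (here a stand-in for a fragment embedding) makes the proxy task harder, that is, raises the loss.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log(exp(s_pos/t) / sum_k exp(s_k/t)).

    The sum in the denominator runs over the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

# Toy 2-D embeddings (hypothetical values, not from the paper).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]                        # another view of the same molecule
easy_negatives = [[-1.0, 0.0], [0.0, -1.0]]  # unrelated molecules
hard_negative = [0.8, 0.6]                   # assumed fragment embedding near the anchor

loss_easy = info_nce(anchor, positive, easy_negatives)
loss_hard = info_nce(anchor, positive, easy_negatives + [hard_negative])
print(loss_easy, loss_hard)
```

With only easy negatives the loss is already near zero, so the encoder receives little training signal; the hard negative raises the loss and forces the representation to separate the molecule from a similar-looking sample.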