Multi-task pretrained language model with novel application domains enables more comprehensive health and ecological toxicity prediction

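Task (project management) Computer science Toxicity Artificial intelligence Machine learning Natural language processing Engineering Chemistry Systems engineering Organic chemistry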
Authors
Zhichao Tan, Youcai Zhao, Kunsen Lin, Tao Zhou
Source
Journal: Journal of Hazardous Materials [Elsevier BV]
Volume/Issue: 477: 135265-135265 Citations: 1
Identifier
DOI:10.1016/j.jhazmat.2024.135265
Abstract

In silico models for screening substances of health and ecological concern are essential for effective chemical management. However, current data-driven toxicity prediction models face formidable challenges related to expressive capacity, data scarcity, and reliability. Thus, this study introduces TOX-BERT, a SMILES-based pretrained model for screening health and ecological toxicity. Results show that masked atom recovery pretraining and multi-task learning offer promising solutions to enhance model capacity and address data scarcity issues. Two novel application domain (AD) parameters, termed PCA-AD and LDS, were proposed to improve the prediction reliability of TOX-BERT, with accuracy surpassing 90% and mean absolute error (MAE) below 0.52. TOX-BERT was applied to 18,905 IECSC chemicals, revealing distinct toxicity relationships that align with experimental studies, such as those between cardiotoxicity and acute ecotoxicity. In addition to previous PBT screening, 156 potential high-risk chemicals for specific endpoints were identified, covering 7 categories. Furthermore, a SMILES-based toxicity site detection approach was developed for structural toxicity analysis. These advancements carry profound implications for addressing the challenges faced by current data-driven toxicity prediction models. TOX-BERT emerges as a valuable tool for more comprehensive, reliable, and applicable predictions of health and ecological toxicity in chemical risk assessment and management.