
Combining Group Contribution Method and Semisupervised Learning to Build Machine Learning Models for Predicting Hydroxyl Radical Rate Constants of Water Contaminants

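GCM transcription factors; Experimental apparatus; Set (abstract data type); Mean squared error; Data set; Computer science; Artificial intelligence; Machine learning; Data mining; Statistics; Mathematics; General circulation model; Ecology; Biology; Climate change; Programming language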
Authors
Zhao Liu,Lanyu Shang,Kuan Huang,Zhenrui Yue,Alan Y. Han,Dong Wang,Huichun Zhang
Source
Journal: Environmental Science & Technology [American Chemical Society]
Identifier
DOI: 10.1021/acs.est.4c11950
Abstract

Machine learning is an effective tool for predicting reaction rate constants for many organic compounds with the hydroxyl radical (HO•). Previously reported models have achieved relatively good performance, but due to scarce data (<1400 records), the applicability domain (AD) has been significantly limited. To address this limitation, we curated a much larger experimental data set (Primary data set), which contains 2358 kinetic records. We then employed both the group contribution method (GCM) and a semisupervised learning (SSL) strategy to add new data points, aiming to effectively expand the model's AD while improving model performance. The results indicated that GCM improved the model's performance for chemicals outside the AD, while SSL expanded the model's AD. The final model, after incorporating 147,168 new data points, achieved an R² = 0.77, root-mean-square-error = 0.32, and mean-absolute-error = 0.24 on the test set. Importantly, the AD was expanded by 117% compared to the model developed solely based on the Primary data set, and the final model can be reliably applied to more than 560,000 chemicals from the DSSTox database. Further model interpretation results indicated that the model made predictions based on a correct "understanding" of the impact of key substituents and reactive sites toward HO•. This research provides an effective method for augmenting data sets, which is important in improving ML model performance and expanding AD. The final model has been made widely accessible through a free online predictor.
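The abstract reports three test-set metrics (R², root-mean-square error, and mean absolute error). As a reading aid, the minimal Python sketch below shows how these quantities are conventionally computed from paired observed and predicted values. It is not the authors' code: the function name `regression_metrics` and the sample numbers are hypothetical, and treating the values as log10 rate constants is an assumption that the abstract does not state.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical values for illustration only (assumed to be log10 rate constants).
observed = [9.0, 9.5, 10.0, 8.5]
predicted = [9.1, 9.4, 9.8, 8.7]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.2f} RMSE={rmse:.3f} MAE={mae:.2f}")
```

With these illustrative inputs the residual sum of squares is 0.10 and the total sum of squares is 1.25, so R² is 0.92; the figures are unrelated to the results reported in the abstract.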