Systematic tracking of nitrogen sources in complex river catchments: Machine learning approach based on microbial metagenomics

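Genome, Pollution, Random forest, Environmental science, Watershed, Water quality, Nonpoint source pollution, Ecology, Computer science, Machine learning, Biology, Biochemistry, Gene
Authors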
Ziqian Zhu,Junjie Ding,Ran Du,Zehua Zhang,Jiayin Guo,Xiaodong Li,Longbo Jiang,Gaojie Chen,Qiurong Bu,Ning Tang,Lan Lu,Xiang Gao,Weixiang Li,Shuai Li,Guangming Zeng,Jie Liang
Source
Journal: Water Research [Elsevier]
Volume/issue: 253: 121255-121255 Citations: 31
Identifier
DOI:10.1016/j.watres.2024.121255
Abstract

Tracking nitrogen pollution sources is crucial for effective water quality management, but it is challenging because contamination scenarios in freshwater systems are complex. Variations in contamination patterns can induce rapid responses in aquatic microorganisms, making them sensitive indicators of pollution origins. In this study, the Soil and Water Assessment Tool (SWAT), accompanied by a detailed pollution source database, was used to detect the main nitrogen pollution sources in each sub-basin of the Liuyang River watershed. Each sub-basin was then assigned to a known class according to the SWAT outputs, including point source pollution-dominated areas, crop cultivation pollution-dominated areas, and septic tank pollution-dominated areas. Based on these outputs, a random forest (RF) model was developed to predict the main pollution sources in different river ecosystems using a series of input variable groups (e.g., natural macroscopic characteristics, river physicochemical properties, 16S rRNA microbial taxonomic composition, microbial metagenomic data containing taxonomic and functional information, and their combination). Accuracy and the Kappa coefficient were used as the performance metrics for the RF model. Across all the input variable groups, the prediction performance of the RF model improved significantly when metagenomic indices were used as inputs. Among the metagenomic data-based models, combining the taxonomic information with the functional information of all the species achieved the highest accuracy (0.84) and an increased median Kappa coefficient (0.70). Feature importance analysis was used to identify key features that could serve as indicators for sudden pollution accidents and contribute to the overall function of the river system.
The bacteria Rhabdochromatium marinum, Frankia, Actinomycetia, and Competibacteraceae were the most important species, with mean decrease Gini indices of 0.0023, 0.0021, 0.0019, and 0.0018, respectively, although their relative abundances ranged only from 0.0004 to 0.1%. Functional variables made up more than half of the top 30 important variables, demonstrating the remarkable variation in microbial functions among sites with distinct pollution sources and the key role of functionality in predicting pollution sources. Many functional indicators related to the metabolism of Mycobacterium tuberculosis, such as K24693, K25621, K16048, and K14952, emerged as significantly important factors in distinguishing nitrogen pollution origins. Given the shortage of pollution source data in developing regions, the proposed approach offers an economical, quick, and accurate solution for locating the origins of water nitrogen pollution using the metagenomic data of microbial communities.
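
The abstract reports accuracy and the Kappa coefficient as the performance metrics for the RF classifier. As a minimal sketch of how these two metrics relate, the Python below computes both from a list of reference classes and a list of predicted classes. It assumes the Kappa coefficient is Cohen's kappa, which the abstract does not state. The class labels and the ten example samples are hypothetical, chosen only for illustration; they are not data from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the reference class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected by chance from the marginal class frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Only one class appears in both lists, so kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for the three sub-basin classes named in the abstract.
y_true = ["point"] * 4 + ["crop"] * 3 + ["septic"] * 3
y_pred = [
    "point", "point", "point", "crop",   # one point-source site misclassified
    "crop", "crop", "septic",            # one crop site misclassified
    "septic", "septic", "septic",
]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"kappa    = {cohen_kappa(y_true, y_pred):.2f}")
```

Kappa discounts the agreement a classifier would reach by chance given the class frequencies, which is why it sits below accuracy when classes are few and fairly balanced.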