Robust prediction of colorectal cancer via gut microbiome 16S rRNA sequencing data

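Microbiome, Random forest, Human microbiome, Artificial intelligence, Machine learning, Colorectal cancer, Computer science, Computational biology, Genome, Classifier (UML), Biology, Human Microbiome Project, Bioinformatics, Cancer, Genetics, Gene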
Authors
Annamaria Porreca,Eliana Ibrahimi,Fabrizio Maturo,Laura Judith Marcos-Zambrano,Melisa Meto,Marta B. Lopes
Source
Journal: Journal of Medical Microbiology [Microbiology Society]
Volume (issue): 73 (10)
Identifier
DOI:10.1099/jmm.0.001903
Abstract

Introduction. The study addresses the challenge of utilizing human gut microbiome data for the early detection of colorectal cancer (CRC). The research emphasizes the potential of using machine learning techniques to analyze complex microbiome datasets, providing a non-invasive approach to identifying CRC-related microbial markers.

Hypothesis/Gap Statement. The primary hypothesis is that a robust machine learning-based analysis of 16S rRNA microbiome data can identify specific microbial features that serve as effective biomarkers for CRC detection, overcoming the limitations of classical statistical models in high-dimensional settings.

Aim. The primary objective of this study is to explore and validate the potential of the human microbiome, specifically in the colon, as a valuable source of biomarkers for colorectal cancer (CRC) detection and progression. The focus is on developing a classifier that effectively predicts the presence of CRC and normal samples based on the analysis of three previously published faecal 16S rRNA sequencing datasets.

Methodology. To achieve the aim, various machine learning techniques are employed, including random forest (RF), recursive feature elimination (RFE) and a robust correlation-based technique known as the fuzzy forest (FF). The study utilizes these methods to analyse the three datasets, comparing their performance in predicting CRC and normal samples. The emphasis is on identifying the most relevant microbial features (taxa) associated with CRC development via partial dependence plots, i.e. a machine learning tool focused on explainability, visualizing how a feature influences the predicted outcome.

Results. The analysis of the three faecal 16S rRNA sequencing datasets reveals the consistent and superior predictive performance of the FF compared to the RF and RFE. Notably, FF proves effective in addressing the correlation problem when assessing the importance of microbial taxa in explaining the development of CRC. The results highlight the potential of the human microbiome as a non-invasive means to detect CRC and underscore the significance of employing FF for improved predictive accuracy.

Conclusion. In conclusion, this study underscores the limitations of classical statistical techniques in handling high-dimensional information such as human microbiome data. The research demonstrates the potential of the human microbiome, specifically in the colon, as a valuable source of biomarkers for CRC detection. Applying machine learning techniques, particularly the FF, is a promising approach for building a classifier to predict CRC and normal samples. The findings advocate for integrating FF to overcome the challenges associated with correlation when identifying crucial microbial features linked to CRC development.
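
The partial dependence idea mentioned in the methodology can be illustrated with a minimal sketch. This is not the authors' code: the predictor `toy_predict`, the two-column "taxa" table and the grid values are all hypothetical stand-ins, and in practice the predictor would be a fitted RF or FF model. The sketch only shows the computation itself: force one feature to each grid value in every sample, then average the model's predictions.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)       # copy so the original sample is untouched
            modified[feature] = value  # overwrite the feature of interest
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


# Hypothetical toy predictor (an assumption, not a trained model): the
# predicted CRC probability depends only on the abundance of taxon 0.
def toy_predict(row):
    return 0.8 if row[0] > 0.5 else 0.2


# Made-up relative abundances for two taxa in four samples.
rows = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5], [0.9, 0.1]]
grid = [0.0, 0.25, 0.75, 1.0]

pd_taxon0 = partial_dependence(toy_predict, rows, 0, grid)  # step-shaped curve
pd_taxon1 = partial_dependence(toy_predict, rows, 1, grid)  # flat curve
```

Plotting `pd_taxon0` against `grid` would give the kind of curve a partial dependence plot displays. The flat `pd_taxon1` curve corresponds to a taxon that, under this toy predictor, has no influence on the predicted outcome.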
