Evaluating Machine Learning Methods of Analyzing Multiclass Metabolomics

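Normalization (sociology), Computer science, Artificial intelligence, Machine learning, Metabolomics, Multiclass classification, Missing data, Database normalization, Data mining, Imputation (statistics), Data set, Pattern recognition (psychology), Support vector machine, Bioinformatics, Sociology, Biology, Anthropology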
Authors
Yaguo Gong, Wei Ding, Panpan Wang, Qibiao Wu, Xiaojun Yao, Qingxia Yang
Source
Journal: Journal of Chemical Information and Modeling [American Chemical Society]
Volume (issue): 63 (24): 7628-7641 Citations: 12
Identifier
DOI: 10.1021/acs.jcim.3c01525
Abstract

Multiclass metabolomic studies have become popular for revealing the differences in multiple stages of complex diseases, various lifestyles, or the effects of specific treatments. In multiclass metabolomics, there are multiple data manipulation steps for analyzing raw data, which consist of data filtering, the imputation of missing values, data normalization, marker identification, sample separation, classification, and so on. In each step, several to dozens of machine learning methods can be chosen for the given data set, with potentially hundreds or thousands of method combinations in the whole data processing chain. Therefore, a clear understanding of these machine learning methods is helpful for selecting an appropriate method combination for obtaining stable and reliable analytical results of specific data. However, there has rarely been an overall introduction or evaluation of these methods based on multiclass metabolomic data. Herein, detailed descriptions of these machine learning methods in multiple data manipulation steps are reviewed. Moreover, an assessment of these methods was performed using a benchmark data set for multiclass metabolomics. First, 12 imputation methods for imputing missing values were evaluated based on the PSS (Procrustes statistical shape analysis) and NRMSE (normalized root-mean-square error) values. Second, 17 normalization methods for processing multiclass metabolomic data were evaluated by applying the PMAD (pooled median absolute deviation) value. Third, different methods of identifying markers of multiclass metabolomics were evaluated based on the CWrel (relative weighted consistency) value. Fourth, nine classification methods for constructing multiclass models were assessed using the AUC (area under the curve) value. Performance evaluations of machine learning methods are highly recommended to select the most appropriate method combination before performing the final analysis of the given data. 
Overall, detailed descriptions and evaluation of various machine learning methods are expected to improve analyses of multiclass metabolomic data.
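
As a rough illustration of the first evaluation step, the sketch below scores an imputation method with NRMSE on synthetic data. It rests on several assumptions that do not come from the abstract: NRMSE is taken as the root-mean-square error over the hidden entries divided by the standard deviation of their true values (one common convention; the paper's exact formula is not given here), column-mean imputation is only a stand-in and is not claimed to be one of the 12 evaluated methods, and the helper names `nrmse` and `column_mean_impute` are hypothetical.

```python
import math
import random
import statistics


def nrmse(true_vals, imputed_vals):
    """RMSE between paired values, divided by the std. dev. of the true values.

    Assumed convention for illustration; other normalizations exist.
    """
    squared_errors = [(t - i) ** 2 for t, i in zip(true_vals, imputed_vals)]
    mse = statistics.fmean(squared_errors)
    return math.sqrt(mse / statistics.pvariance(true_vals))


def column_mean_impute(matrix):
    """Fill each None with the mean of the observed values in its column."""
    n_cols = len(matrix[0])
    means = [
        statistics.fmean([row[j] for row in matrix if row[j] is not None])
        for j in range(n_cols)
    ]
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in matrix]


# Synthetic "intensity" matrix: 30 samples x 8 metabolites (made-up data).
random.seed(0)
truth = [[random.lognormvariate(2.0, 0.5) for _ in range(8)] for _ in range(30)]

# Hide exactly 24 entries (10%) so the true values are known for scoring.
all_cells = [(i, j) for i in range(30) for j in range(8)]
hidden = random.sample(all_cells, k=24)
masked = [row[:] for row in truth]
for i, j in hidden:
    masked[i][j] = None

imputed = column_mean_impute(masked)
score = nrmse([truth[i][j] for i, j in hidden],
              [imputed[i][j] for i, j in hidden])
print(f"NRMSE of column-mean imputation: {score:.3f}")
```

A lower NRMSE means the imputed values lie closer to the hidden true values. Under this setup, several imputation methods could be compared by running each on the same masked matrix and ranking the resulting scores.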