Statistical strategies for avoiding false discoveries in metabolomics and related experiments

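Overfitting · Computer science · False discovery rate · Multiple comparisons problem · Spurious relationship · Univariate · Metabolomics · Statistical power · Statistical hypothesis testing · Sample size determination · Biomarker discovery · Set (abstract data type) · Biology · Data mining · False positive paradox · Machine learning · Artificial intelligence · Bioinformatics · Multivariate statistics · Statistics · Proteomics · Mathematics · Biochemistry · Artificial neural network · Gene · Programming language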
Authors
David Broadhurst, Douglas B. Kell
Source
Journal: Metabolomics [Springer Science+Business Media]
Volume (issue): 2 (4): 171-196  Citations: 797
Identifier
DOI: 10.1007/s11306-006-0037-z
Abstract

Many metabolomics, and other high-content or high-throughput, experiments are set up such that the primary aim is the discovery of biomarker metabolites that can discriminate, with a certain level of certainty, between nominally matched ‘case’ and ‘control’ samples. However, it is unfortunately very easy to find markers that are apparently persuasive but that are in fact entirely spurious, and there are well-known examples in the proteomics literature. The main types of danger are not entirely independent of each other, but include bias, inadequate sample size (especially relative to the number of metabolite variables and to the required statistical power to prove that a biomarker is discriminant), excessive false discovery rate due to multiple hypothesis testing, inappropriate choice of particular numerical methods, and overfitting (generally caused by the failure to perform adequate validation and cross-validation). Many studies fail to take these into account, and thereby fail to discover anything of true significance (despite their claims). We summarise these problems, and provide pointers to a substantial existing literature that should assist in the improved design and evaluation of metabolomics experiments, thereby allowing robust scientific conclusions to be drawn from the available data. We provide a list of some of the simpler checks that might improve one’s confidence that a candidate biomarker is not simply a statistical artefact, and suggest a series of preferred tests and visualisation tools that can assist readers and authors in assessing papers. These tools can be applied to individual metabolites by using multiple univariate tests performed in parallel across all metabolite peaks. They may also be applied to the validation of multivariate models. We stress in particular that classical p-values such as “p < 0.05”, that are often used in biomedicine, are far too optimistic when multiple tests are done simultaneously (as in metabolomics). 
Ultimately it is desirable that all data and metadata are available electronically, as this allows the entire community to assess conclusions drawn from them. These analyses apply to all high-dimensional ‘omics’ datasets.
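
The abstract's warning that "p < 0.05" is far too optimistic under multiple testing can be illustrated with a minimal sketch. It compares an uncorrected threshold with two standard corrections, Bonferroni and the Benjamini-Hochberg step-up procedure. These are common textbook choices and are not necessarily the tests the paper itself recommends; the function names are hypothetical, and the family-wise error formula assumes independent tests.

```python
import random


def family_wise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive among m independent true-null tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Reject only p-values at or below alpha / m (controls family-wise error)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate).

    Returns a list of booleans in the input order; True means 'rejected'.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Simulated example: 1000 metabolite peaks with NO real case/control effect,
    # so each p-value is uniform on [0, 1] and every 'hit' is a false discovery.
    random.seed(0)
    null_p = [random.random() for _ in range(1000)]
    print("uncorrected p < 0.05:", sum(p < 0.05 for p in null_p))
    print("Bonferroni:          ", sum(bonferroni(null_p)))
    print("Benjamini-Hochberg:  ", sum(benjamini_hochberg(null_p)))
    print("FWER for 100 tests:  ", family_wise_error_rate(100))
```

With 100 independent tests at alpha = 0.05, the chance of at least one false positive is above 99%, which is why a per-test threshold that is reasonable for a single comparison becomes unreliable when applied across all metabolite peaks.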