False Discovery Rate Control via Data Splitting

False discovery rate; Computer science; Mathematics; Data mining; Statistics; Biology; Biochemistry; Gene
Authors
Chenguang Dai, Buyu Lin, Xin Xing, Jun S. Liu
Identifier
DOI:10.1080/01621459.2022.2060113
Abstract

Selecting relevant features associated with a given response variable is an important problem in many scientific fields. Quantifying quality and uncertainty of a selection result via false discovery rate (FDR) control has been of recent interest. This article introduces a data-splitting method (referred to as “DS”) to asymptotically control the FDR while maintaining a high power. For each feature, DS constructs a test statistic by estimating two independent regression coefficients via data splitting. FDR control is achieved by taking advantage of the statistic’s property that, for any null feature, its sampling distribution is symmetric about zero; whereas for a relevant feature, its sampling distribution has a positive mean. Furthermore, a Multiple Data Splitting (MDS) method is proposed to stabilize the selection result and boost the power. Surprisingly, with the FDR under control, MDS not only helps overcome the power loss caused by data splitting, but also results in a lower variance of the false discovery proportion (FDP) compared with all other methods in consideration. Extensive simulation studies and a real-data application show that the proposed methods are robust to the unknown distribution of features, easy to implement and computationally efficient, and are often the most powerful ones among competitors especially when the signals are weak and correlations or partial correlations among features are high. Supplementary materials for this article are available online.
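
The abstract describes DS only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes a low-dimensional linear model fitted by ordinary least squares on each half of the data, a statistic of the form sign(b1 * b2) * (|b1| + |b2|), and a cutoff chosen as the smallest value whose estimated false discovery proportion is at most the target level q. The function names `mirror_statistics` and `select_features` are hypothetical. The symmetry property from the abstract is what justifies the FDP estimate: for null features the count below -t approximates the count above t.

```python
import numpy as np


def mirror_statistics(X, y, rng):
    """Split the rows in half, fit OLS on each half, and combine the estimates.

    Assumed form of the statistic: sign(b1 * b2) * (|b1| + |b2|).
    It is large and positive when both halves agree on a sizeable effect.
    """
    n = X.shape[0]
    perm = rng.permutation(n)
    half1, half2 = perm[: n // 2], perm[n // 2:]
    b1, *_ = np.linalg.lstsq(X[half1], y[half1], rcond=None)
    b2, *_ = np.linalg.lstsq(X[half2], y[half2], rcond=None)
    return np.sign(b1 * b2) * (np.abs(b1) + np.abs(b2))


def select_features(m, q):
    """Return indices with m_j > t for the smallest cutoff t meeting level q.

    The FDP at cutoff t is estimated by #{m_j < -t} / max(#{m_j > t}, 1),
    which relies on null statistics being symmetric about zero.
    """
    for t in np.sort(np.abs(m[m != 0])):
        fdp_hat = np.sum(m < -t) / max(np.sum(m > t), 1)
        if fdp_hat <= q:
            return np.flatnonzero(m > t)
    return np.array([], dtype=int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((400, 20))
    beta = np.zeros(20)
    beta[:5] = 1.0  # synthetic data: the first five features are relevant
    y = X @ beta + rng.standard_normal(400)
    m = mirror_statistics(X, y, rng)
    print(select_features(m, q=0.1))
```

A single split uses only half the sample for each estimate, which is the power loss the abstract says MDS is meant to recover; aggregating selections over repeated splits is not shown here.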