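Confidence interval
Statistics
Epidemiology
Statistical power
Statistical significance
Null hypothesis
Statistical hypothesis testing
P value
Demography
Mathematics
Distribution (mathematics)
Medicine
Statistical model
Reproducibility
Econometrics
Internal medicine
Mathematical analysis
Sociology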
Authors
Sarah F. Ackley, Ryan M. Andrews, Christopher Seaman, Michael Flanders, Ruijia Chen, Jingxuan Wang, Gildete Barreto Lopes, Kendra D. Sims, Peter Buto, Erin Ferguson, Isabel Elaine Allen, M. Maria Glymour
Abstract
Epidemiologists have advocated for reporting confidence intervals and deemphasizing P values to address long-standing concerns about null-hypothesis statistical-significance testing, P hacking, and reproducibility. It is unknown if efforts to reduce reliance on P values have altered the distribution of P values. For 21 332 abstracts published 2000 to 2024 in 4 major epidemiology journals, two-sided P values (N = 25 288) were calculated from estimates and confidence intervals scraped using ChatGPT’s 4o-mini model. We evaluated trends over time to determine whether the empirical distribution of P values changed. We fitted to expected P-value distributions and simulated these distributions with and without assuming changes in statistical power over time. Average P values decreased from 2000 to 2024; the fraction of P values just below the .05 threshold also decreased. Fits to models indicate that statistical power increased. Increasing power would reduce average P value while also decreasing P values near the .05 threshold, precisely the trends observed in epidemiology journals. Although the frequency of P values near the .05 threshold has declined modestly, this likely reflects increases in statistical power rather than decreases in P hacking.
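The abstract says two-sided P values were calculated from estimates and confidence intervals but does not give the formula. One common way to do this is sketched below. It assumes the interval is a 95% Wald-type interval under a normal approximation, and that ratio measures (odds ratios, risk ratios, hazard ratios) are handled on the log scale. The function name `p_from_ci` and its parameters are hypothetical, chosen for illustration; this is not necessarily the authors' procedure.

```python
import math

# Assumption: 95% intervals, so the critical value is the 0.975 normal quantile.
Z_95 = 1.959964


def p_from_ci(estimate, lower, upper, ratio=True):
    """Approximate two-sided P value from a point estimate and its 95% CI.

    Illustrative sketch only: assumes a symmetric Wald interval on the
    analysis scale (log scale for ratio measures, raw scale otherwise).
    """
    if ratio:
        # Ratio measures are approximately normal on the log scale.
        estimate, lower, upper = (math.log(v) for v in (estimate, lower, upper))
    # The interval spans 2 * Z_95 standard errors.
    se = (upper - lower) / (2 * Z_95)
    z = estimate / se
    # Two-sided P = 2 * (1 - Phi(|z|)), which equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

For example, `p_from_ci(1.5, 1.0, 2.25)` describes a ratio whose lower limit sits exactly on the null value of 1, so the result is approximately .05. For difference measures, pass `ratio=False` so the null value is 0 on the raw scale.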