Identifying mislabelled samples: Machine learning models exceed human performance

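Machine learning, Artificial intelligence, Computer science, Decision tree, Artificial neural network, Support vector machine, Random forest, Logistic regression, Gradient boosting
Author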
Christopher‐John L. Farrell
Source
Journal: Annals of Clinical Biochemistry [SAGE Publishing]
Volume (issue): 58 (6): 650-652 Cited by: 23
Identifier
DOI:10.1177/00045632211032991
Abstract

Background It is difficult for clinical laboratories to identify samples that are labelled with the details of an incorrect patient. Many laboratories screen for these errors with delta checks, with final decision-making based on manual review of results by laboratory staff. Machine learning models have been shown to outperform delta checks for identifying these errors. However, a comparison of machine learning models to human-level performance has not yet been made.

Methods Deidentified data for current and previous (within seven days) electrolytes, urea and creatinine results was used in the computer simulation of mislabelled samples. Eight different machine learning models were developed on 127,256 sets of results using different algorithms: artificial neural network, extreme gradient boosting, support vector machine, random forest, logistic regression, k-nearest neighbours and two decision trees (one complex and one simple). A separate test data-set (n = 14,140) was used to evaluate the performance of these models as well as laboratory staff volunteers, who manually reviewed a random subset of this data (n = 500).

Results The best performing machine learning model was the artificial neural network (92.1% accuracy), with the simple decision tree demonstrating the poorest accuracy (86.5%). The accuracy of laboratory staff for identifying mislabelled samples was 77.8%.

Conclusions The results of this preliminary investigation suggest that even relatively simple machine learning models can exceed human performance for identifying mislabelled samples. Machine learning techniques should be considered for implementation in clinical laboratories to assist with error identification.
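The simulation idea in the Methods can be illustrated with a minimal sketch. It is not the author's code or data: the analyte list, the population and within-patient spreads, the 50% mislabelling rate and all function names are assumptions chosen for illustration, and the classifier is a single-threshold rule on a summed delta score (closer to a delta check or a one-split decision stump than to any of the eight models in the study). Mislabelling is simulated by pairing one patient's previous results with another patient's current results.

```python
import random

# ASSUMPTION: illustrative analytes and spreads, not values from the paper.
# (population mean, between-patient SD, within-patient SD)
ANALYTES = {
    "sodium": (140.0, 3.0, 1.0),
    "potassium": (4.2, 0.4, 0.2),
    "urea": (6.0, 2.5, 0.6),
    "creatinine": (80.0, 25.0, 5.0),
}
WITHIN_SD = [within for _, _, within in ANALYTES.values()]


def make_patient(rng):
    """Return (previous, current) synthetic results for one patient."""
    baseline = [rng.gauss(mean, between) for mean, between, _ in ANALYTES.values()]
    previous = [rng.gauss(b, w) for b, w in zip(baseline, WITHIN_SD)]
    current = [rng.gauss(b, w) for b, w in zip(baseline, WITHIN_SD)]
    return previous, current


def simulate(n, rng, mislabel_rate=0.5):
    """Build n (delta_score, label) pairs; label 1 marks a simulated mislabel."""
    patients = [make_patient(rng) for _ in range(n)]
    data = []
    for i, (previous, current) in enumerate(patients):
        mislabelled = rng.random() < mislabel_rate
        if mislabelled:
            # Swap in the current results of a different, randomly chosen patient.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            current = patients[j][1]
        # Sum of absolute changes, each scaled by its within-patient SD.
        score = sum(abs(c - p) / w for c, p, w in zip(current, previous, WITHIN_SD))
        data.append((score, int(mislabelled)))
    return data


def accuracy(data, cut):
    """Fraction of samples where 'score above cut' matches the label."""
    return sum((score > cut) == bool(label) for score, label in data) / len(data)


def fit_threshold(train):
    """Choose the cut-off with the highest training accuracy."""
    return max((score for score, _ in train), key=lambda cut: accuracy(train, cut))


rng = random.Random(0)
train = simulate(1000, rng)
test = simulate(500, rng)
cut = fit_threshold(train)
print(f"threshold={cut:.2f}, test accuracy={accuracy(test, cut):.3f}")
```

Because the data are synthetic, the accuracy printed here says nothing about the figures reported in the abstract; the sketch only shows how labelled mislabelling examples can be generated without real errors and then used to fit and evaluate a rule on a held-out set.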
