Avoiding common machine learning pitfalls

Machine learning; Artificial intelligence; Computer science; Process (computing); Standardization; Operating system
Author
Michael A. Lones
Source
Journal: Patterns [Elsevier BV]
Volume (issue): 5 (10): 101046 Citations: 7
Identifier
DOI:10.1016/j.patter.2024.101046
Abstract

The bigger picture

Machine learning has transitioned from a niche pursuit to one with mass appeal. Thanks to the accessibility of modern machine learning tools, it is now very easy to get started in machine learning, yet this ease of use masks the underlying complexities of doing machine learning. This, coupled with a relatively inexperienced community of practitioners, has led to flawed practices, which are reflected in issues such as poor reproducibility within machine-learning-based studies.

This tutorial aims to address this problem by educating practitioners about the many things that can go wrong when applying machine learning and providing guidance on how to avoid these pitfalls. However, this is just part of the longer-term process that is needed to improve practice, as machine learning will only meet its ambitions if it is able to become a robust and trusted applied discipline. Other factors that have a role to play in this include better tools, standardization, and regulation.

Summary

Mistakes in machine learning practice are commonplace and can result in loss of confidence in the findings and products of machine learning. This tutorial outlines common mistakes that occur when using machine learning and what can be done to avoid them. While it should be accessible to anyone with a basic understanding of machine learning techniques, it focuses on issues that are of particular concern within academic research, such as the need to make rigorous comparisons and reach valid conclusions. It covers five stages of the machine learning process: what to do before model building, how to reliably build models, how to robustly evaluate models, how to compare models fairly, and how to report results.
