Evaluation of Missing Data Analytical Techniques in Longitudinal Research: Traditional and Machine Learning Approaches

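Missing data, Longitudinal data, Computer science, Machine learning, Artificial intelligence, Data science, Data mining
Authors
Dandan Tang, Xin Tong
Source
Journal: Cornell University - arXiv
Identifier
DOI: 10.48550/arxiv.2406.13814
Abstract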

Missing Not at Random (MNAR) and nonnormal data are challenging to handle. Traditional missing data analytical techniques such as full information maximum likelihood estimation (FIML) may fail with nonnormal data as they are built on normal distribution assumptions. Two-Stage Robust Estimation (TSRE) does manage nonnormal data, but both FIML and TSRE are less explored in longitudinal studies under MNAR conditions with nonnormal distributions. Unlike traditional statistical approaches, machine learning approaches do not require distributional assumptions about the data. More importantly, they have shown promise for MNAR data; however, their application in longitudinal studies, addressing both Missing at Random (MAR) and MNAR scenarios, is also underexplored. This study utilizes Monte Carlo simulations to assess and compare the effectiveness of six analytical techniques for missing data within the growth curve modeling framework. These techniques include traditional approaches like FIML and TSRE, machine learning approaches by single imputation (K-Nearest Neighbors and missForest), and machine learning approaches by multiple imputation (micecart and miceForest). We investigate the influence of sample size, missing data rate, missing data mechanism, and data distribution on the accuracy and efficiency of model estimation. Our findings indicate that FIML is most effective for MNAR data among the tested approaches. TSRE excels in handling MAR data, while missForest is only advantageous in limited conditions with a combination of very skewed distributions, very large sample sizes (e.g., n larger than 1000), and low missing data rates.
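
To make the MAR/MNAR distinction in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' simulation design. It simulates linear growth trajectories and deletes final-wave values under each mechanism. The population values, the four-wave design, the 30% missing rate, the deterministic "drop the largest values" rule, and the function names `simulate_growth`, `impose_missing`, and `observed_mean` are all assumptions made for illustration.

```python
import random
import statistics


def simulate_growth(n, waves=4, seed=0):
    """Simulate n linear growth trajectories: y_it = intercept_i + slope_i * t + noise.

    All population values below are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        intercept = rng.gauss(5.0, 1.0)
        slope = rng.gauss(1.0, 0.5)
        data.append([intercept + slope * t + rng.gauss(0.0, 0.5) for t in range(waves)])
    return data


def impose_missing(data, mechanism, rate=0.3):
    """Return a copy of data with a share `rate` of final-wave values set to None.

    MAR:  missingness is driven by the always-observed wave-0 value.
    MNAR: missingness is driven by the final-wave value itself.
    In both cases the rows with the largest driver values lose their final wave
    (a deliberately simple, deterministic rule).
    """
    if mechanism not in ("MAR", "MNAR"):
        raise ValueError("mechanism must be 'MAR' or 'MNAR'")
    n = len(data)
    last = len(data[0]) - 1
    driver_col = 0 if mechanism == "MAR" else last
    n_missing = round(n * rate)
    # Indices sorted by driver value; the top n_missing rows are dropped.
    order = sorted(range(n), key=lambda i: data[i][driver_col])
    drop = set(order[n - n_missing:])
    result = [row[:] for row in data]  # copy so the complete data stay intact
    for i in drop:
        result[i][last] = None
    return result


def observed_mean(data):
    """Mean of the final wave over rows where it is observed."""
    return statistics.mean(row[-1] for row in data if row[-1] is not None)


if __name__ == "__main__":
    full = simulate_growth(500)
    for mechanism in ("MAR", "MNAR"):
        incomplete = impose_missing(full, mechanism)
        print(mechanism, "observed final-wave mean:", round(observed_mean(incomplete), 3))
    print("complete-data final-wave mean:", round(observed_mean(full), 3))
```

Under this rule the MNAR deletion removes exactly the largest final-wave values, so the observed mean under MNAR can never exceed the observed mean under MAR or the complete-data mean. Under MAR the cause of missingness (the wave-0 score) remains in the data for a model or imputation method to condition on, whereas under MNAR it does not.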