Imputation (statistics)
Missing data
Generalized linear model
Lasso (statistics)
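Feature selection
Mathematics
Computer science
Statistics
Linear model
Estimating equations
Generalized estimating equation
Data mining
Algorithm
Econometrics
Maximum likelihood
Artificial intelligence
World Wide Web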
Authors
Yang Li, Haoyu Yang, Haochen Yu, Hanwen Huang, Ye Shen
Abstract
Missing values among variables present a challenge for variable selection in the generalized linear model. Common strategies that delete observations with missing information may cause serious information loss. Multiple imputation has been widely used in recent years because it provides unbiased statistical results given a correctly specified imputation model and accounts for the uncertainty of the missing data. However, variable selection methods for the generalized linear model with multiply-imputed data have not yet been studied widely. In this study, we introduce penalized estimating equations for generalized linear models with multiple imputation (PEE–MI), which incorporate the correlation of multiply-imputed observations into the objective function. The theoretical performance of the proposed PEE–MI depends on the penalty function adopted. We use the adaptive least absolute shrinkage and selection operator (adaptive LASSO) as an illustrative example. Simulations show that PEE–MI outperforms the alternatives. The proposed method is shown to select variables with clinical relevance when applied to a database of laboratory-diagnosed A/H7N9 patients in Zhejiang province, China.
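To make the general idea concrete, the following is a minimal sketch of adaptive LASSO variable selection on stacked multiply-imputed data. It is not the authors' PEE–MI: it assumes a Gaussian linear model (the identity-link special case of a generalized linear model), it treats the stacked imputed copies as independent rows and so ignores the correlation that PEE–MI is designed to incorporate, and it uses a toy random-draw imputation rather than a properly specified imputation model. All function names, the simulated data, and the tuning values (`lam`, `gamma`) are hypothetical choices made for illustration.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(X, y, penalties, n_sweeps=200):
    """Minimise (1/2n) * ||y - X b||^2 + sum_j penalties[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, penalties[j]) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_bj
    return beta


def adaptive_lasso_stacked(imputed_X, imputed_y, lam, gamma=1.0):
    """Adaptive LASSO on the row-wise stack of all imputed datasets.

    Assumption: stacked rows are treated as independent, which is a
    simplification and not the PEE-MI objective from the abstract.
    """
    X = [row for Xm in imputed_X for row in Xm]
    y = [v for ym in imputed_y for v in ym]
    p = len(X[0])
    # Unpenalised fit (all penalties zero) gives the initial estimate.
    init = coordinate_descent(X, y, [0.0] * p)
    # Adaptive weights: small initial coefficients get large penalties.
    penalties = [lam / max(abs(b), 1e-8) ** gamma for b in init]
    return coordinate_descent(X, y, penalties)


# Simulated example (hypothetical data): covariates 0 and 2 are truly active.
random.seed(0)
n, M = 200, 3
true_beta = [2.0, 0.0, -1.5, 0.0]
X_full = [[random.gauss(0, 1) for _ in true_beta] for _ in range(n)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.5)
     for row in X_full]

# About 20% of the values of covariate 0 are treated as missing.
missing = [random.random() < 0.2 for _ in range(n)]
observed_x0 = [row[0] for row, m in zip(X_full, missing) if not m]

# Toy imputation: fill each missing value with a random observed value, M times.
imputed_X = []
for _ in range(M):
    Xm = [list(row) for row in X_full]
    for i in range(n):
        if missing[i]:
            Xm[i][0] = random.choice(observed_x0)
    imputed_X.append(Xm)
imputed_y = [y] * M

beta_hat = adaptive_lasso_stacked(imputed_X, imputed_y, lam=0.3)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("selected covariates:", selected)
```

Because every imputed copy enters one stacked fit, each covariate is either selected or dropped once across all imputations, which avoids having to reconcile different selections from separate per-imputation fits. In practice `lam` would be chosen by cross-validation or an information criterion rather than fixed by hand.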