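Covariate
Consistency (knowledge base)
Conditional dependence
Computer science
Correlation
Ranking (information retrieval)
Data mining
Set (abstract data type)
Machine learning
Statistics
Mathematics
Artificial intelligence
Geometry
Programming language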
Authors
Hengjian Cui,Yanyan Liu,Guangcai Mao,Jing Zhang
Identifier
DOI:10.1002/bimj.202200089
Abstract
Selecting the active variables that have a significant impact on the event of interest is an important problem in the statistical analysis of ultrahigh-dimensional data. In many applications, researchers already know from previous investigations and experience that a certain set of covariates is active. Using this prior knowledge of active variables, we propose a model-free conditional screening procedure for ultrahigh-dimensional survival data based on conditional distance correlation. The proposed procedure can effectively detect hidden active variables that are jointly important but only weakly correlated with the response. Moreover, it performs well when covariates are strongly correlated with each other. We establish the sure screening property and the ranking consistency of the proposed method and conduct extensive simulation studies, which suggest that the proposed procedure works well in practical situations. Finally, we illustrate the new approach using a real dataset from the diffuse large-B-cell lymphoma study.