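Computer science
Population
Markov chain Monte Carlo
Artificial neural network
Nonlinear system
Feature selection
Consistency (knowledge bases)
Monte Carlo method
Selection (genetic algorithm)
Mathematical optimization
Machine learning
Algorithm
Artificial intelligence
Bayesian probability
Mathematics
Statistics
Physics
Sociology
Demography
Quantum mechanics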
Authors
Faming Liang, Qizhai Li, Lei Zhou
Identifier
DOI:10.1080/01621459.2017.1409122
Abstract
Recent advances in high-throughput biotechnologies have provided an unprecedented opportunity for biomarker discovery, which, from a statistical point of view, can be cast as a variable selection problem. This problem is challenging due to the high-dimensional and nonlinear nature of omics data and, in general, it suffers from three difficulties: (i) an unknown functional form of the nonlinear system, (ii) variable selection consistency, and (iii) highly demanding computation. To circumvent the first difficulty, we employ a feed-forward neural network to approximate the unknown nonlinear function, motivated by its universal approximation ability. To circumvent the second difficulty, we conduct structure selection for the neural network, which induces variable selection, by choosing appropriate prior distributions that lead to the consistency of variable selection. To circumvent the third difficulty, we implement the population stochastic approximation Monte Carlo algorithm, a parallel adaptive Markov chain Monte Carlo algorithm, on the OpenMP platform, which provides a speedup of the simulation that is linear in the number of cores of the computer. The numerical results indicate that the proposed method can work very well for identification of relevant variables for high-dimensional nonlinear systems. The proposed method is successfully applied to identification of the genes that are associated with anticancer drug sensitivities based on the data collected in the cancer cell line encyclopedia study. Supplementary materials for this article are available online.
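The link between structure selection and variable selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the prior distributions and the population stochastic approximation Monte Carlo sampler are omitted, bias terms are left out for brevity, and the names `forward`, `selected_variables`, `w_in`, `mask_in`, and `w_out` are hypothetical. It assumes a one-hidden-layer tanh network in which each input-to-hidden connection carries a 0/1 indicator, and an input variable counts as selected when at least one of its connections is active.

```python
import math


def forward(x, w_in, mask_in, w_out):
    """Output of a one-hidden-layer tanh network with masked input connections.

    x          : list of input values, one per candidate variable
    w_in[j][k] : weight of the connection from input j to hidden unit k
    mask_in    : 0/1 indicators with the same shape as w_in (assumed encoding)
    w_out[k]   : weight from hidden unit k to the single output
    """
    hidden = []
    for k in range(len(w_out)):
        # Only connections whose indicator is 1 contribute to hidden unit k.
        s = sum(x[j] * w_in[j][k] * mask_in[j][k] for j in range(len(x)))
        hidden.append(math.tanh(s))
    return sum(h * w for h, w in zip(hidden, w_out))


def selected_variables(mask_in):
    """Indices of inputs that keep at least one active connection."""
    return [j for j, row in enumerate(mask_in) if any(row)]


# Illustrative values only: 4 candidate variables, 2 hidden units.
w_in = [[0.5, -0.3], [1.2, 0.7], [-0.8, 0.4], [0.1, -0.9]]
w_out = [1.0, -2.0]
mask_in = [[1, 0], [0, 0], [1, 1], [0, 0]]

x = [0.2, -1.0, 0.5, 3.0]
x_changed = [0.2, 99.0, 0.5, 3.0]  # perturb variable 1, which has no active connection

print(selected_variables(mask_in))  # [0, 2]
print(forward(x, w_in, mask_in, w_out) == forward(x_changed, w_in, mask_in, w_out))  # True
```

In this sketch, perturbing a variable with no active connection leaves the network output unchanged, which is why choosing the network structure also chooses the variables.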