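Principal component analysis
Sparse PCA
Elastic net regularization
Lasso (programming language)
Dimensionality reduction
Multivariate statistics
Curse of dimensionality
Constraint (computer-aided design)
Pattern recognition (psychology)
Computer science
Artificial intelligence
Principal component regression
Regression
Mathematics
Machine learning
Statistics
Feature selection
World Wide Web
Geometry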
Authors
Hui Zou,Trevor Hastie,Robert Tibshirani
Identifier
DOI:10.1198/106186006x113430
Abstract
Principal component analysis (PCA) is widely used in data processing and dimensionality reduction. However, PCA suffers from the fact that each principal component is a linear combination of all the original variables, thus it is often difficult to interpret the results. We introduce a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings. We first show that PCA can be formulated as a regression-type optimization problem; sparse loadings are then obtained by imposing the lasso (elastic net) constraint on the regression coefficients. Efficient algorithms are proposed to fit our SPCA models for both regular multivariate data and gene expression arrays. We also give a new formula to compute the total variance of modified principal components. As illustrations, SPCA is applied to real and simulated data with encouraging results.
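The regression view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it only handles the first component, and the function names (`soft_threshold`, `elastic_net_cd`, `sparse_first_component`), the penalty values `lam2` and `lam1`, the iteration counts, and the synthetic data are all assumptions made for illustration. The idea shown is to regress the current component scores `X @ alpha` on `X` with an elastic net penalty, then refresh `alpha` from the resulting sparse coefficients.

```python
import numpy as np

def soft_threshold(z, t):
    # Shrink z toward zero by t; values with |z| <= t become exactly 0.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net_cd(X, y, lam2, lam1, n_iter=100):
    # Coordinate descent for: ||y - X b||^2 + lam2 * ||b||^2 + lam1 * ||b||_1
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y.copy()  # residual y - X b, with b = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]          # drop variable j from the fit
            rho = X[:, j] @ r
            b[j] = soft_threshold(rho, lam1 / 2.0) / (col_sq[j] + lam2)
            r -= X[:, j] * b[j]          # put the updated variable back
    return b

def sparse_first_component(X, lam2=1e-3, lam1=10.0, n_outer=20):
    # Alternate between an elastic net regression and a renormalisation step.
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    alpha = Vt[0]                        # start from the ordinary first loading
    beta = alpha.copy()
    for _ in range(n_outer):
        beta = elastic_net_cd(X, X @ alpha, lam2, lam1)
        if not np.any(beta):             # penalty too strong: everything zeroed
            break
        a = X.T @ (X @ beta)
        alpha = a / np.linalg.norm(a)
    norm = np.linalg.norm(beta)
    return beta / norm if norm > 0 else beta

# Synthetic data (an assumption): three variables share one latent factor,
# three further variables are low-variance noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(200, 3)),
               0.1 * rng.normal(size=(200, 3))])
loading = sparse_first_component(X)
print(np.round(loading, 3))
```

In this toy setting the ordinary first loading has small but nonzero weights on the noise variables, whereas the penalised version sets them exactly to zero, which is the interpretability gain the abstract refers to. Larger `lam1` values give sparser loadings at the cost of explained variance.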