Principal component analysis
Outlier
Dimensionality reduction
Computer science
Curse of dimensionality
Artificial intelligence
Moment (physics)
Separable space
Sparse PCA
Mathematics
Robust principal component analysis
Pattern recognition (psychology)
Dimension (graph theory)
Algorithm
Applied mathematics
Classical mechanics
Physics
Mathematical analysis
Pure mathematics
Authors
Debolina Paul, Saptarshi Chakraborty, Swagatam Das
Identifier
DOI:10.1109/tnnls.2023.3298011
Abstract
Principal component analysis (PCA) is a fundamental tool for data visualization, denoising, and dimensionality reduction. It is widely popular in statistics, machine learning, computer vision, and related fields. However, PCA is well known to fall prey to outliers and often fails to detect the true underlying low-dimensional structure within the dataset. Following the Median of Means (MoM) philosophy, recent supervised learning methods have shown great success in dealing with outlying observations without much compromise to their large-sample theoretical properties. This article proposes a PCA procedure based on the MoM principle. Called MoMPCA, the proposed method is not only computationally appealing but also achieves optimal convergence rates under minimal assumptions. In particular, we derive nonasymptotic error bounds for the obtained solution with the aid of Rademacher complexities, while making no assumptions whatsoever on the outlying observations. The derived concentration results are dimension-free because the analysis is conducted in a separable Hilbert space, and they depend only on the fourth moment of the underlying distribution in the corresponding norm. The proposal's efficacy is thoroughly demonstrated through simulations and real-data applications.
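The abstract describes the estimator only at a high level. Below is a minimal NumPy sketch of one common way to combine the median-of-means principle with PCA: partition the sample into blocks, score each block by its mean reconstruction error under the current projection, and refit on the block whose risk is the median. The function name `mom_pca`, the alternating block-selection scheme, and all parameter choices are illustrative assumptions, not the authors' published algorithm.

```python
import numpy as np

def mom_pca(X, n_components=2, n_blocks=10, n_iter=50, seed=0):
    """Illustrative median-of-means PCA (hypothetical sketch, not the
    paper's exact procedure): alternate between (i) ranking blocks by
    mean reconstruction error and (ii) refitting the projection on the
    median-risk block."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Center with the global mean; a robust centering could be used instead.
    Xc = X - X.mean(axis=0)
    blocks = np.array_split(rng.permutation(n), n_blocks)

    # Initialize with a random orthonormal d x k basis.
    V, _ = np.linalg.qr(rng.standard_normal((d, n_components)))

    for _ in range(n_iter):
        # Mean squared reconstruction error of each block under current V.
        risks = [np.mean(np.sum((Xc[b] - Xc[b] @ V @ V.T) ** 2, axis=1))
                 for b in blocks]
        median_block = blocks[np.argsort(risks)[len(risks) // 2]]

        # Refit: top eigenvectors of the median block's covariance.
        Xb = Xc[median_block]
        cov = Xb.T @ Xb / len(median_block)
        _, eigvecs = np.linalg.eigh(cov)        # eigenvalues ascending
        V = eigvecs[:, ::-1][:, :n_components]  # take the top k
    return V

# Toy usage: a rank-2 signal in 20 dimensions plus a few gross outliers.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 20))
X[:20] += 50 * rng.standard_normal((20, 20))  # contaminate 4% of the rows
V = mom_pca(X, n_components=2)
```

Because each update uses only the median-risk block, a handful of grossly outlying rows concentrated in a few blocks cannot drag the fitted subspace, which is the intuition behind the MoM philosophy the abstract invokes.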