Principal component analysis
Population
Robustness (evolution)
Subspace topology
Dimension (graph theory)
Nonlinear system
Mathematics
Data point
Linear model
Pattern recognition (psychology)
Computer science
Algorithm
Artificial intelligence
Statistics
Biology
Combinatorics
Biochemistry
Physics
Demography
Quantum mechanics
Sociology
Gene
Authors
Anandita De,Rishidev Chaudhuri
Identifier
DOI:10.1073/pnas.2305853120
Abstract
Populations of neurons represent sensory, motor, and cognitive variables via patterns of activity distributed across the population. The size of the population used to encode a variable is typically much greater than the dimension of the variable itself, and thus, the corresponding neural population activity occupies lower-dimensional subsets of the full set of possible activity states. Given population activity data with such lower-dimensional structure, a fundamental question asks how close the low-dimensional data lie to a linear subspace. The linearity or nonlinearity of the low-dimensional structure reflects important computational features of the encoding, such as robustness and generalizability. Moreover, identifying such linear structure underlies common data analysis methods such as Principal Component Analysis (PCA). Here, we show that for data drawn from many common population codes the resulting point clouds and manifolds are exceedingly nonlinear, with the dimension of the best-fitting linear subspace growing at least exponentially with the true dimension of the data. Consequently, linear methods like PCA fail dramatically at identifying the true underlying structure, even in the limit of arbitrarily many data points and no noise.
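The abstract's central claim — that PCA can need far more components than the intrinsic dimension of a common population code — can be illustrated with a minimal sketch. This is not the paper's code; the tuning-curve model (a 1-D circular variable encoded by von Mises-style tuning curves) and all parameter values are assumptions chosen only to demonstrate the effect.

```python
import numpy as np

# Illustrative sketch (assumed model, not the paper's code):
# a 1-D circular variable theta is encoded noiselessly by N neurons
# with von Mises-style tuning curves -- a common population code.
N = 100                                                # number of neurons
theta = np.linspace(0, 2 * np.pi, 2000, endpoint=False)  # 1-D latent variable
centers = np.linspace(0, 2 * np.pi, N, endpoint=False)   # preferred stimuli
kappa = 20.0                                           # tuning concentration (narrow tuning)

# Activity matrix: samples x neurons, noiseless by construction
X = np.exp(kappa * (np.cos(theta[:, None] - centers[None, :]) - 1))

# PCA via SVD of the mean-centered data
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
var_ratio = s**2 / np.sum(s**2)
cum = np.cumsum(var_ratio)

# Number of linear dimensions needed to capture 95% of the variance
dims_needed = int(np.searchsorted(cum, 0.95)) + 1
print(f"Intrinsic dimension: 1; PCA components for 95% variance: {dims_needed}")
```

Even though the data lie on a 1-D manifold with no noise, the narrow tuning makes the embedding highly curved, so the 95%-variance linear subspace is many dimensions larger than 1 — the qualitative failure mode of linear methods that the abstract describes.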