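Cluster analysis
Matrix decomposition
Non-negative matrix factorization
Computer science
Artificial intelligence
Similarity (geometry)
Linear subspace
Similarity learning
Rank (graph theory)
Data mining
Pattern recognition (psychology)
Subspace topology
Feature vector
Unsupervised learning
Machine learning
Mathematics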
Image (mathematics)
Physics
Combinatorics
Quantum mechanics
Geometry
Authors
Chuan-Yuan Wang, Ying-Lian Gao, Xiang-Zhen Kong, Jin-Xing Liu, Chun-Hou Zheng
Identifier
DOI:10.1109/jbhi.2021.3091506
Abstract
The development of single-cell RNA sequencing (scRNA-seq) technology has made it possible to measure gene expression levels at the resolution of a single cell, which further reveals the complex growth processes of cells such as mutation and differentiation. Recognizing cell heterogeneity is one of the most critical tasks in scRNA-seq research. To address this task, we propose a non-negative matrix factorization framework based on multi-subspace cell similarity learning for unsupervised scRNA-seq data analysis (MscNMF). MscNMF includes three parts: data decomposition, similarity learning, and similarity fusion. The three work together to complete the data similarity learning task. MscNMF can learn the gene features and cell features of different subspaces, and the correlation and heterogeneity between cells become more prominent across multiple subspaces. The redundant information and noise in each low-dimensional feature space are eliminated, and the gene weight information of each subspace can be further analyzed to calculate the optimal number of subpopulations. The final cell similarity learning is more satisfactory because it fuses the cell similarity information from different subspaces. The advantage of MscNMF is that it can reasonably calculate both the number of cell types and the rank of the non-negative matrix factorization (NMF). Experiments on eight real scRNA-seq datasets show that MscNMF can effectively perform clustering tasks and extract useful genetic markers. To verify its clustering performance, the framework is compared with other recent clustering algorithms, and satisfactory results are obtained. The code of MscNMF is freely available for academic use (https://github.com/wangchuanyuan1/project-MscNMF).
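The three stages named in the abstract (data decomposition, similarity learning, similarity fusion) can be illustrated with a minimal sketch. This is not the authors' MscNMF implementation: it assumes plain NMF with standard multiplicative updates, cosine similarity between cells within each subspace, and a simple average as the fusion step. The function names (`nmf`, `cosine_similarity`, `fused_similarity`), the choice of ranks, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def nmf(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Plain NMF by multiplicative updates: X (genes x cells) ~ W @ H."""
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    W = rng.random((n_genes, rank)) + eps  # gene features of this subspace
    H = rng.random((rank, n_cells)) + eps  # cell features of this subspace
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cosine_similarity(H, eps=1e-10):
    """Cell-by-cell cosine similarity from the columns of H (rank x cells)."""
    norms = np.linalg.norm(H, axis=0, keepdims=True) + eps
    Hn = H / norms
    return Hn.T @ Hn


def fused_similarity(X, ranks=(2, 3, 4)):
    """Average the per-subspace similarities (an assumed, simplified fusion)."""
    sims = [cosine_similarity(nmf(X, r, seed=r)[1]) for r in ranks]
    return np.mean(sims, axis=0)


# Toy data (hypothetical, not from the paper): 6 genes x 8 cells, two groups.
rng = np.random.default_rng(42)
X = rng.random((6, 8)) * 0.1
X[:3, :4] += 1.0  # genes 0-2 are highly expressed in cells 0-3
X[3:, 4:] += 1.0  # genes 3-5 are highly expressed in cells 4-7

S = fused_similarity(X)
within = S[:4, :4].mean()   # mean similarity inside the first group
between = S[:4, 4:].mean()  # mean similarity across the two groups
```

On data with clear group structure, `within` should exceed `between`, so the fused matrix `S` could be passed to any similarity-based clustering method. How MscNMF itself learns and fuses the similarities, and how it selects the rank and the number of cell types, is described in the paper and the linked repository rather than in this sketch.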