NMFLRR: Clustering scRNA-Seq Data by Integrating Nonnegative Matrix Factorization With Low Rank Representation

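Computer science, Matrix decomposition, Representation (politics), Non-negative matrix factorization, Cluster analysis, Low-rank approximation, Biclustering, Pattern recognition (psychology), Data mining, Matrix (chemical analysis), Rank (graph theory), Artificial intelligence, Mathematics, Fuzzy clustering, CURE data clustering algorithm, Eigenvalues and eigenvectors, Law, Chemistry, Combinatorics, Mathematical analysis, Physics, Politics, Quantum mechanics, Chromatography, Hankel matrix, Political science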

Authors

Wei Zhang, Xiaoli Xue, Xiaoying Zheng, Zizhu Fan

Source

Journal: IEEE Journal of Biomedical and Health Informatics [Institute of Electrical and Electronics Engineers]
Date: 2021-07-26  Volume (issue): 26 (3): 1394-1405  Citations: 23

Links

nih.gov, doi.org

Identifiers

DOI: 10.1109/jbhi.2021.3099127

Abstract

Fast-developing single-cell technologies create unprecedented opportunities to reveal cell heterogeneity and diversity. Accurate classification of single cells is a critical prerequisite for recovering the mechanisms of heterogeneity. However, currently available scRNA-seq profiles are high-dimensional, sparse, and noisy, which makes it difficult for existing clustering methods to group cells that belong to the same subpopulation based on their transcriptomic profiles. Although many computational methods have been proposed, developing novel and effective methods to accurately identify cell types remains a considerable challenge. We present a new computational framework, named NMFLRR, that identifies cell types by integrating low-rank representation (LRR) and nonnegative matrix factorization (NMF). The LRR captures the global properties of the original data by using the nuclear norm, and a locality-constrained graph regularization term is introduced to characterize the data's local geometric information. The similarity matrix and the low-dimensional features of the data are obtained simultaneously by applying the alternating direction method of multipliers (ADMM), which updates each variable alternately in an iterative way. The predicted cell types are finally obtained by applying a spectral algorithm to the optimized similarity matrix. Nine real scRNA-seq datasets were used to test the performance of NMFLRR and fifteen other competitive methods, and the accuracy and robustness of the simulation results suggest that NMFLRR is a promising algorithm for the classification of single cells. The simulation code is freely available at: https://github.com/wzhangwhu/NMFLRR_code.
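
Two of the building blocks named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that); the function names `svt` and `spectral_labels`, the toy matrix `S_demo`, and the small built-in k-means are all assumptions made for illustration. Singular value thresholding is the standard proximal step for a nuclear-norm term in ADMM-style solvers, but the abstract does not state that NMFLRR uses exactly this update. The second function shows one common way a spectral algorithm can turn a similarity matrix into cluster labels.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm.
    (Hypothetical helper; typical of ADMM updates for low-rank terms.)"""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Shrink each singular value by tau and clip at zero.
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def spectral_labels(S, k, n_iter=50):
    """Cluster n items into k groups from an n x n similarity matrix S.
    (Hypothetical helper; a generic normalized spectral clustering.)"""
    # Symmetrize so the graph Laplacian is well defined.
    W = (np.abs(S) + np.abs(S).T) / 2.0
    d = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k smallest.
    _, vecs = np.linalg.eigh(L)
    E = vecs[:, :k]
    E = E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
    # Deterministic farthest-point initialization for a small k-means.
    centers = [E[0]]
    for _ in range(1, k):
        dist = np.min([((E - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

# Toy similarity matrix (made-up values): two groups of three "cells",
# strongly similar within a group and weakly similar across groups.
S_demo = np.full((6, 6), 0.05)
S_demo[:3, :3] = 0.9
S_demo[3:, 3:] = 0.9
labels_demo = spectral_labels(S_demo, k=2)
```

On this toy input the first three items share one label and the last three share the other. In a real pipeline the similarity matrix would come from the optimization described in the abstract rather than being written by hand.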
