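Imputation (statistics)
Single-cell sequencing
Missing data
Computer science
Computational biology
Data mining
Cluster analysis
Rank (graph theory)
Biology
Mathematics
Gene
Artificial intelligence
Machine learning
Phenotype
Exome sequencing
Genetics
Combinatorics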
Authors
Zhijie Ni,Xiaoying Zheng,Xiao Zheng,Xiufen Zou
Identifier
DOI:10.1109/tcbb.2020.3025804
Abstract
With the successful application of single-cell sequencing technology, a large amount of single-cell multi-omics sequencing (scMO-seq) data has been generated, enabling researchers to study heterogeneity between individual cells. One prominent problem in single-cell data analysis is the prevalence of dropouts, which are caused by amplification failures during the experiments. Effective approaches for imputing these missing values are therefore needed. Unlike general methods that impute a single type of single-cell data, we propose an imputation method called scLRTD, which uses low-rank tensor decomposition based on the nuclear norm to impute scMO-seq data and single-cell RNA-sequencing (scRNA-seq) data from different stages, tissues, or conditions. Four sets of simulated data and two sets of real scRNA-seq data, from mouse embryonic stem cells and hepatocellular carcinoma respectively, are used in numerical experiments, and the results are compared with those of six other published methods. Imputation accuracy and clustering results demonstrate the effectiveness of the proposed method. Moreover, we clearly identify two cell subpopulations after imputing the real scMO-seq data from hepatocellular carcinoma. Gene Ontology analysis further identifies seven genes in the bile secretion pathway, which is related to metabolism in hepatocellular carcinoma. Survival analysis using the TCGA database also shows that the two cell subpopulations identified after imputation have distinct survival rates.
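To make the idea of nuclear-norm-based low-rank tensor imputation concrete, the following is a minimal NumPy sketch. It is not the scLRTD algorithm from the paper. It assumes a generic scheme in which each mode unfolding of the tensor is passed through singular value thresholding, the results are averaged, and the observed entries are then restored. The function names (`unfold`, `fold`, `svt`, `impute_low_rank`), the parameters `tau` and `n_iter`, and the availability of a boolean mask of observed entries are all assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize a tensor: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor with the given original shape."""
    moved = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape(moved), 0, mode)


def svt(matrix, tau):
    """Singular value thresholding: the proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def impute_low_rank(observed, mask, tau=1.0, n_iter=200):
    """Fill entries where `mask` is False (hypothetical helper, not scLRTD).

    observed: array, e.g. genes x cells x omics (or conditions)
    mask:     boolean array of the same shape, True where a value was measured
    """
    x = np.where(mask, observed, 0.0)
    for _ in range(n_iter):
        # Low-rank estimate: average the thresholded unfoldings over all modes.
        est = np.mean(
            [fold(svt(unfold(x, m), tau), m, x.shape) for m in range(x.ndim)],
            axis=0,
        )
        # Keep measured values fixed; update only the missing entries.
        x = np.where(mask, observed, est)
    return x
```

In real single-cell data a zero count may be either a dropout or a true absence of expression, so deciding which entries go into `mask` is itself a modelling choice that this sketch does not address.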