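Embryo
Cell
Data set
Computational biology
Spatial analysis
Biology
Gene
Spatial distribution
In situ hybridization
Osteosarcoma
Computer science
Set (abstract data type)
Gene expression
Artificial intelligence
Cell biology
Genetics
Cancer research
Mathematics
Statistics
Programming language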
Authors
Mengbo Zhu, Changjun Li, Kebo Lv, Hongzhe Guo, Rui Hou, Geng Tian, Jialiang Yang
Identifier
DOI:10.1016/j.compbiomed.2023.106873
Abstract
Single-cell RNA sequencing (scRNA-seq) technologies allow us to interrogate the state of an individual cell within its microenvironment. However, cells must be dissociated before sequencing, which makes it difficult to obtain their spatial information. Because the spatial distribution of cells is critical in some settings, such as cancer immunotherapy, we present MLSpatial, a novel computational method that learns the relationship between gene expression patterns and the spatial locations of cells and then predicts the cell-to-cell distance distribution from scRNA-seq data alone.
We collected a Drosophila embryo dataset containing both fluorescence in situ hybridization (FISH) data and scRNA-seq data. The FISH data provide the spatial positions of 3039 cells and the expression of 84 genes in each cell. The scRNA-seq data contain the expression of 8924 genes in 1297 high-quality cells whose locations are unknown. For comparison, we also collected MERFISH data for 645 osteosarcoma cells, with known cell locations and the expression status of 10,050 genes. For each dataset, the cells were randomly divided into a training set and a test set at a ratio of 7:3.
On the Drosophila embryo FISH data, the cell-to-cell distances extracted by our model corresponded more closely to the true distances (correlation coefficient 0.99) than those of existing methods. On the osteosarcoma data, however, the model captured the spatial relationship between cells with a correlation of 0.514 with the true distances. We also applied the model trained on the Drosophila embryo FISH data to the Drosophila embryo scRNA-seq data, for which the true cell locations are unknown. The reconstructed pseudo-embryo and the real embryo (as shown by the FISH data) were highly similar in the spatial distribution of gene expression.
MLSpatial can accurately restore the relative positions of cells from scRNA-seq data; however, its performance depends on the cell type. The trained model may be useful for reconstructing the spatial distribution of single cells from scRNA-seq data alone, provided that the scRNA-seq data and the FISH data share a similar background (i.e., the same tissue with a similar disease background).
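The evaluation described in the abstract (a random 7:3 split of cells, then a correlation between predicted and true cell-to-cell distances) can be illustrated with a minimal sketch. This is not the MLSpatial implementation: the function names, the fixed seed, the use of Euclidean distance, and the choice of the Pearson coefficient are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random
from itertools import combinations


def split_cells(n_cells, train_fraction=0.7, seed=0):
    """Randomly split cell indices into training and test sets.

    The 0.7 default mirrors the 7:3 ratio in the abstract; the seed is
    a hypothetical choice made only for reproducibility.
    """
    indices = list(range(n_cells))
    random.Random(seed).shuffle(indices)
    cut = round(n_cells * train_fraction)
    return indices[:cut], indices[cut:]


def pairwise_distances(coords):
    """Euclidean distance for every unordered pair of cells (assumed metric)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def distance_agreement(true_coords, predicted_coords):
    """Correlate true and predicted cell-to-cell distances."""
    return pearson(
        pairwise_distances(true_coords),
        pairwise_distances(predicted_coords),
    )


# Toy example: a prediction that is a uniformly scaled copy of the true layout
# preserves all relative distances, so the agreement is (numerically) 1.
true_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
predicted_coords = [(2 * x, 2 * y) for x, y in true_coords]
train_idx, test_idx = split_cells(3039)
agreement = distance_agreement(true_coords, predicted_coords)
```

Correlating distances rather than raw coordinates makes the score insensitive to translation, rotation, and uniform scaling of the predicted layout, which suits a method that aims to restore relative rather than absolute cell positions.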