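Computer science
Scalability
Cluster analysis
Data integration
Representation (politics)
Computational biology
Systems biology
Schema
Search engine indexing
Data type
Data mining
Theoretical computer science
Robustness (evolution)
External Data Representation
Biclustering
Distributed computing
Spectral clustering
Workflow
Hierarchical clustering
Synthetic biology
Artificial intelligence
Machine learning
Biological data
Annotation
Data sharing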
Authors
Yinan Shi,Yanchi Su,Yue Cheng,Ka‐Chun Wong,Yunhe Wang,Xiangtao Li
Identifier
DOI:10.1002/advs.202509247
Abstract
Single-cell multi-omics technologies are pivotal for deciphering the complexities of biological systems, with Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) emerging as a particularly valuable approach. The dual-modality capability makes CITE-seq particularly advantageous for dissecting cellular heterogeneity and understanding the dynamic interplay between transcriptomic and proteomic landscapes. However, existing computational models for integrating these two modalities often struggle to capture the complex, non-linear interactions between RNA and antibody-derived tags (ADTs), and are computationally intensive. To address these issues, scMHVA, a novel and lightweight framework designed to integrate the diverse modalities of CITE-seq data, is proposed. scMHVA utilizes an adaptive dynamic synthesis module to generate consolidated yet heterogeneous embeddings from RNA and ADT modalities. Subsequently, scMHVA enhances inter-modality correlations within the joint representation by applying a multi-head self-attention mechanism, effectively capturing the intricate mapping relationships between mRNA expression levels and protein abundance. Extensive experiments demonstrate that scMHVA consistently outperformed existing single-modal and multi-modal clustering methods across CITE-seq datasets of varying scales, exhibiting linear runtime scalability and effectively eliminating batch effects, thereby establishing it as a robust tool for large-scale CITE-seq data analysis. Additionally, it is demonstrated that scMHVA successfully annotates different cell types in a published mouse thymocyte dataset and reveals dynamics of immune cell development.
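The abstract describes applying multi-head self-attention to a joint representation built from RNA and ADT embeddings. The sketch below illustrates only that generic idea in NumPy, and it is not the authors' implementation: the two-token layout (one RNA token and one ADT token per cell), the random weight matrices, the dimensions, and the mean-pooling step are all assumptions made for illustration, and the adaptive dynamic synthesis module is not modeled.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, n_heads):
    """Scaled dot-product self-attention with several heads.

    tokens: array of shape (n_cells, n_tokens, d_model).
    Returns the attended tokens (same shape) and the attention weights
    of shape (n_cells, n_heads, n_tokens, n_tokens).
    """
    n_cells, n_tokens, d_model = tokens.shape
    assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
    d_head = d_model // n_heads

    def split_heads(x):
        # (cells, tokens, d_model) -> (cells, heads, tokens, d_head)
        return x.reshape(n_cells, n_tokens, n_heads, d_head).transpose(0, 2, 1, 3)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    # Each token attends to every token of the same cell (RNA <-> ADT).
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    context = weights @ v

    # Merge the heads back into a single d_model-sized vector per token.
    merged = context.transpose(0, 2, 1, 3).reshape(n_cells, n_tokens, d_model)
    return merged @ w_o, weights


# Hypothetical per-cell embeddings standing in for encoded RNA and ADT data.
rng = np.random.default_rng(0)
n_cells, d_model, n_heads = 5, 8, 2
rna_emb = rng.normal(size=(n_cells, d_model))
adt_emb = rng.normal(size=(n_cells, d_model))
tokens = np.stack([rna_emb, adt_emb], axis=1)  # (cells, 2, d_model)

# Random projection matrices; in a trained model these would be learned.
w_q, w_k, w_v, w_o = (
    rng.normal(scale=d_model ** -0.5, size=(d_model, d_model)) for _ in range(4)
)

fused, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, n_heads)
joint = fused.mean(axis=1)  # one joint vector per cell (assumed pooling choice)
```

In this toy setup, `attn[c, h]` is a 2×2 matrix showing how strongly the RNA and ADT tokens of cell `c` attend to each other in head `h`, and `joint` could be passed to any clustering routine.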