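Computer science
Scalability
Computational biology
Benchmark (surveying)
Systems biology
Schema
Data mining
Theoretical computer science
Biology
Database
Social science
Geodesy
Sociology
Geography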
Authors
Yinan Shi, Yanchi Su, Yue Cheng, Ka‐Chun Wong, Yunhe Wang, Xiangtao Li
Identifier
DOI:10.1002/advs.202509247
Abstract
Single‐cell multi‐omics technologies are pivotal for deciphering the complexity of biological systems, and Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE‐seq) has emerged as a particularly valuable approach. Its dual‐modality capability, profiling the transcriptome and surface proteins in the same cells, makes CITE‐seq well suited to dissecting cellular heterogeneity and understanding the dynamic interplay between transcriptomic and proteomic landscapes. However, existing computational models for integrating these two modalities often struggle to capture the complex, non‐linear interactions between RNA and antibody‐derived tags (ADTs), and they are computationally intensive. To address these issues, scMHVA, a novel and lightweight framework for integrating the modalities of CITE‐seq data, is proposed. scMHVA uses an adaptive dynamic synthesis module to generate consolidated yet heterogeneous embeddings from the RNA and ADT modalities. It then strengthens inter‐modality correlations within the joint representation by applying a multi‐head self‐attention mechanism, capturing the intricate mapping between mRNA expression levels and protein abundance. Extensive experiments demonstrate that scMHVA consistently outperforms existing single‐modal and multi‐modal clustering methods across CITE‐seq datasets of varying scales, exhibits linear runtime scalability, and effectively eliminates batch effects, establishing it as a robust tool for large‐scale CITE‐seq data analysis. Additionally, scMHVA is shown to successfully annotate different cell types in a published mouse thymocyte dataset and to reveal the dynamics of immune cell development.
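To make the multi‐head self‐attention step concrete, the following is a minimal NumPy sketch of generic scaled dot‐product attention applied across two per‐cell modality tokens. It is not the scMHVA implementation: the abstract does not specify the architecture, so the function name `multi_head_self_attention`, the treatment of each cell as a two‐token sequence (one RNA embedding, one ADT embedding), the random projection weights, and all dimensions are assumptions chosen purely for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, n_heads):
    """Generic multi-head self-attention (hypothetical helper, not from scMHVA).

    tokens: array of shape (n_cells, n_tokens, d_model).
    Returns the attended tokens and the attention weights.
    """
    n_cells, n_tokens, d_model = tokens.shape
    d_head = d_model // n_heads

    def split_heads(x):
        # (n_cells, n_tokens, d_model) -> (n_cells, n_heads, n_tokens, d_head)
        return x.reshape(n_cells, n_tokens, n_heads, d_head).transpose(0, 2, 1, 3)

    q, k, v = (split_heads(tokens @ w) for w in (w_q, w_k, w_v))
    # Scaled dot-product scores between every pair of tokens, per head.
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    context = attn @ v
    # Merge heads back: (n_cells, n_heads, n_tokens, d_head) -> (n_cells, n_tokens, d_model)
    merged = context.transpose(0, 2, 1, 3).reshape(n_cells, n_tokens, d_model)
    return merged @ w_o, attn


# Toy data standing in for per-cell RNA and ADT embeddings (assumed shapes).
rng = np.random.default_rng(0)
n_cells, d_model, n_heads = 5, 8, 2
rna_emb = rng.normal(size=(n_cells, d_model))
adt_emb = rng.normal(size=(n_cells, d_model))

# Assumption: each cell is a sequence of two tokens, one per modality.
tokens = np.stack([rna_emb, adt_emb], axis=1)

# Random projection matrices; in a trained model these would be learned.
w_q, w_k, w_v, w_o = (
    rng.normal(scale=d_model ** -0.5, size=(d_model, d_model)) for _ in range(4)
)

out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, n_heads)
joint = out.mean(axis=1)  # one joint embedding per cell
```

In this sketch, `attn[c, h]` is a 2×2 matrix describing how strongly the RNA and ADT tokens of cell `c` attend to each other in head `h`, and `joint` could be passed to any downstream clustering step.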