计算机科学
可扩展性
注释
深度学习
管道(软件)
人工智能
生物医学
机器学习
计算生物学
数据挖掘
生物信息学
生物
数据库
程序设计语言
作者
Chuanyang Zheng,Yixuan Wang,Yuqi Cheng,Xuesong Wang,Hongxin Wei,Irwin King,Yu Li
摘要
Single-cell RNA sequencing has achieved massive success in biological research fields. Discovering novel cell types from single-cell transcriptomics has been demonstrated to be essential in the field of biomedicine, yet is time-consuming and needs prior knowledge. With the unprecedented boom in cell atlases, auto-annotation tools have become more prevalent due to their speed, accuracy and user-friendly features. However, existing tools have mostly focused on general cell-type annotation and have not adequately addressed the challenge of discovering novel rare cell types. In this work, we introduce scNovel, a powerful deep learning-based neural network that specifically focuses on novel rare cell discovery. By testing our model on diverse datasets with different scales, protocols and degrees of imbalance, we demonstrate that scNovel significantly outperforms previous state-of-the-art novel cell detection models, reaching the most AUROC performance(the only one method whose averaged AUROC results are above 94%, up to 16.26% more comparing to the second-best method). We validate scNovel's performance on a million-scale dataset to illustrate the scalability of scNovel further. Applying scNovel on a clinical COVID-19 dataset, three potential novel subtypes of Macrophages are identified, where the COVID-related differential genes are also detected to have consistent expression patterns through deeper analysis. We believe that our proposed pipeline will be an important tool for high-throughput clinical data in a wide range of applications.
科研通智能强力驱动
Strongly Powered by AbleSci AI