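Scalability
Foundation (evidence)
Atlas (anatomy)
Biology
Genome
Computational biology
Computer science
Genetics
Anatomy
Gene
Archaeology
Geography
Operating system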
Authors
Arthur Huang, Tsung-Han S. Hsieh, Jiang Zhu, Jackson Michuda, Andy Teng, Soohong Kim, Elizabeth M Rumsey, Suzanne Lam, Ikenna Anigbogu, Paul S. Wright, Mohamed Ameen, Kwontae You, Carole Graves, Hyunsung John Kim, Adam J. Litterman, Rene Sit, Alex Blocker, Ci Chu
Identifier
DOI: 10.1101/2025.06.11.659105
Abstract
The rapid expansion of massively parallel sequencing technologies has enabled the development of foundation models to uncover novel biological findings. While these have the potential to significantly accelerate scientific discoveries by creating AI-driven virtual cell models, their progress has been greatly limited by the lack of large-scale high-quality perturbation data, which remains constrained due to scalability bottlenecks and assay variability. Here, we introduce "Fix-Cryopreserve-ScRNAseq" (FiCS) Perturb-seq, an industrialized platform for scalable Perturb-seq data generation. We demonstrate that FiCS Perturb-seq exhibits high sensitivity and low batch effects, effectively capturing perturbation-induced transcriptomic changes and recapitulating known biological pathways and protein complexes. In addition, we release X-Atlas: Orion edition (X-Atlas/Orion), the largest publicly available Perturb-seq atlas. This atlas, generated from two genome-wide FiCS Perturb-seq experiments targeting all human protein-coding genes, comprises eight million cells deeply sequenced to over 16,000 unique molecular identifiers (UMIs) per cell. Furthermore, we show that single guide RNA (sgRNA) abundance can serve as a proxy for gene knockdown (KD) efficacy. Leveraging the deep sequencing and substantial cell numbers per perturbation, we also show that stratification by sgRNA expression can reveal dose-dependent genetic effects. Taken together, we demonstrate that FiCS Perturb-seq is an efficient and scalable platform for high-throughput Perturb-seq screens. Through the release of X-Atlas/Orion, we highlight the potential of FiCS Perturb-seq to address current scalability and variability challenges in data generation, advance foundation model development that incorporates gene-dosage effects, and accelerate biological discoveries.
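The stratification idea mentioned in the abstract (grouping cells that carry the same perturbation by their sgRNA abundance and comparing target-gene expression across groups) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `stratify_by_sgrna`, the equal-size binning scheme, and the toy numbers are all assumptions made for illustration only.

```python
from statistics import mean


def stratify_by_sgrna(cells, n_bins=3):
    """Split cells into equal-size bins ordered by sgRNA UMI count.

    `cells` is a list of (sgrna_umis, target_expr) pairs for ONE perturbation.
    Both the input layout and the binning scheme are hypothetical choices,
    not taken from the paper.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    ranked = sorted(cells, key=lambda c: c[0])  # low to high sgRNA abundance
    n = len(ranked)
    summary = []
    for i in range(n_bins):
        chunk = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue  # fewer cells than bins: skip empty strata
        summary.append({
            "bin": i,
            "n_cells": len(chunk),
            "mean_sgrna_umis": mean(c[0] for c in chunk),
            "mean_target_expr": mean(c[1] for c in chunk),
        })
    return summary


# Toy data (invented): higher sgRNA counts paired with lower target expression.
toy_cells = [(1, 10.0), (2, 9.0), (5, 6.0), (6, 5.0), (20, 2.0), (30, 1.0)]
for row in stratify_by_sgrna(toy_cells, n_bins=3):
    print(row)
```

If sgRNA abundance tracks knockdown efficacy as the abstract reports, mean target expression would be expected to fall from the lowest to the highest bin; in real data the bins would need enough cells per perturbation for the comparison to be meaningful.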