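Pathological
Biology
Euclidean distance
Principal component analysis
Metric (data warehouse)
Computational biology
Transcriptome
Artificial intelligence
Pattern recognition (psychology)
Computer science
Bioinformatics
Null distribution
Null (SQL)
Reference data
Distance measurement
Data mining
Null hypothesis
Position (finance)
Statistical hypothesis testing
Statistical model
Correlation
Pathology
Cell
Clinical practice
Table (database)
Tissue sample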
Authors
Samin Rahman Khan, M Saifur Rahman, Md. Abul Hassan Samee
Identifier
DOI:10.1101/gr.280411.125
Abstract
The surge in single-cell data sets and reference atlases has enabled the comparison of cell states across conditions, yet a gap persists in quantifying pathological shifts from healthy cell states. To address this gap, we introduce single-cell Pathological Shift Scoring (scPSS), which provides a statistical measure for how much a “query” cell from a diseased sample has shifted away from a reference group of healthy cells. In scPSS, the distance of a cell to its k-th nearest reference cell is considered as its pathological shift score. Euclidean distances in the top n principal component space of the gene expressions are used to measure distances between cells. The distribution of shift scores of the reference cells forms a null model. This allows a P-value to be assigned to each query cell's shift score, quantifying its statistical significance of being in the reference cell group. This makes our method both simple and statistically rigorous. The key strength of scPSS is its applicability in a “semisupervised” setting, where only healthy reference cells are known and diseased-labeled data are not provided for model training. As existing methods do not support cell-level pathological progression measurement in this setting, we adapt state-of-the-art supervised pathological prediction and contrastive models for benchmarking. Comparative evaluations against these adapted models demonstrate our method's superiority in accuracy and efficiency. Additionally, we show that the aggregation of cell-level pathological scores from scPSS can be used to predict health conditions at the individual level.
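The scoring scheme described in the abstract (projection onto the top n principal components, distance to the k-th nearest reference cell, and a P-value against the reference null distribution) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the default values of `n_components` and `k`, the exclusion of a reference cell from its own neighbor list, and the add-one empirical P-value are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def fit_pca(reference, n_components):
    """Fit PCA on reference cells only; return the mean and the loading matrix."""
    mean = reference.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components].T


def kth_nn_distance(points, reference, k, exclude_self=False):
    """Euclidean distance from each point to its k-th nearest reference cell."""
    diffs = points[:, None, :] - reference[None, :, :]
    dists = np.sort(np.linalg.norm(diffs, axis=2), axis=1)
    # Assumption: when scoring reference cells against themselves, column 0
    # is the zero self-distance, so the k-th neighbor sits at index k.
    return dists[:, k if exclude_self else k - 1]


def shift_scores_and_pvalues(reference_expr, query_expr, n_components=2, k=5):
    """Return (shift score, empirical P-value) arrays for the query cells."""
    mean, loadings = fit_pca(reference_expr, n_components)
    ref_pc = (reference_expr - mean) @ loadings
    qry_pc = (query_expr - mean) @ loadings

    # Null model: shift scores of the reference cells themselves.
    null_scores = kth_nn_distance(ref_pc, ref_pc, k, exclude_self=True)
    query_scores = kth_nn_distance(qry_pc, ref_pc, k)

    # Assumption: one-sided empirical P-value with an add-one correction.
    exceed = (null_scores[None, :] >= query_scores[:, None]).sum(axis=1)
    p_values = (exceed + 1) / (null_scores.size + 1)
    return query_scores, p_values
```

Called on an expression matrix of healthy reference cells and a matrix of query cells (rows are cells, columns are genes), the function returns one score and one P-value per query cell. Query cells that lie far from the reference cloud in principal component space receive large scores and small P-values. The brute-force pairwise distance matrix is adequate for a small illustration only; at atlas scale a nearest-neighbor index would be the more practical choice.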