A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis

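Foundation (evidence), Pathology, Cancer, Computer science, Medicine, Natural language processing, History, Internal medicine, Archaeology
Authors
Xiao Zhou, Luoyi Sun, Da He, Wenbin Guan, Ruifen Wang, Lifeng Wang, Xin Sun, Kun Sun, Ya Zhang, Yanfeng Wang, Weidi Xie
Source
Journal: Cornell University - arXiv
Identifier
DOI: 10.48550/arxiv.2412.13126
Abstract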

Deep learning has enabled the development of highly robust foundation models for various pathological tasks across diverse diseases and patient cohorts. Among these approaches, vision-language pre-training leverages large-scale paired data to align pathology image and text embedding spaces and provides a novel zero-shot paradigm for downstream tasks. However, existing models have been primarily data-driven and do not incorporate domain-specific knowledge, which limits their performance in cancer diagnosis, especially for rare tumor subtypes. To address this limitation, we establish a Knowledge-enhanced Pathology (KEEP) foundation model that harnesses disease knowledge to facilitate vision-language pre-training. Specifically, we first construct a disease knowledge graph (KG) that covers 11,454 human diseases with 139,143 disease attributes, including synonyms, definitions, and hypernym relations. We then systematically reorganize millions of publicly available, noisy pathology image-text pairs into 143K well-structured semantic groups linked through the hierarchical relations of the disease KG. To derive more nuanced image and text representations, we propose a novel knowledge-enhanced vision-language pre-training approach that integrates disease knowledge into the alignment within hierarchical semantic groups instead of unstructured image-text pairs. Validated on 18 diverse benchmarks with more than 14,000 whole slide images (WSIs), KEEP achieves state-of-the-art performance in zero-shot cancer diagnostic tasks. Notably, for cancer detection, KEEP demonstrates an average sensitivity of 89.8% at a specificity of 95.0% across 7 cancer types. For cancer subtyping, KEEP achieves a median balanced accuracy of 0.456 in subtyping 30 rare brain cancers, indicating strong generalizability for diagnosing rare tumors.
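
The abstract reports two metrics: sensitivity at a fixed specificity (for cancer detection) and balanced accuracy (for subtyping). The sketch below shows one common way to compute them in plain Python. It is an illustration only and does not reproduce the paper's evaluation code; the function names `balanced_accuracy` and `sensitivity_at_specificity`, the thresholding rule, and the toy labels and scores are all assumptions made for this example.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def sensitivity_at_specificity(labels, scores, target_specificity=0.95):
    """Best sensitivity among thresholds whose specificity meets the target.

    labels: 1 = cancer, 0 = normal (assumed encoding).
    scores: higher means "more likely cancer".
    A sample is predicted positive when its score is strictly above the threshold.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    best = 0.0
    for thr in sorted(set(scores)):
        # Specificity: fraction of negatives correctly kept at or below thr.
        specificity = sum(1 for s in neg if s <= thr) / len(neg)
        if specificity >= target_specificity:
            # Sensitivity: fraction of positives scored above thr.
            sensitivity = sum(1 for s in pos if s > thr) / len(pos)
            best = max(best, sensitivity)
    return best


if __name__ == "__main__":
    # Toy data invented for illustration; not taken from the paper.
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
    toy_scores = [0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95]
    print(sensitivity_at_specificity(toy_labels, toy_scores, 1.0))
    print(balanced_accuracy(["a", "a", "a", "b"], ["a", "a", "b", "b"]))
```

Balanced accuracy averages recall per class, so a rare subtype counts as much as a common one, which is why it suits evaluations with many rare classes. Sensitivity at a fixed specificity fixes the false-positive rate first and then asks how many true cancers are still caught.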
