HiCur-NPC: Hierarchical Feature Fusion Curriculum Learning for Multi-Modal Foundation Model in Nasopharyngeal Carcinoma

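Nasopharyngeal carcinoma, Modal verb, Feature (linguistics), Foundation (evidence), Computer science, Artificial intelligence, Feature extraction, Medicine, Radiation therapy, Radiology, Materials science, Linguistics, History, Philosophy, Archaeology, Polymer chemistry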
Authors
Zipei Wang, Mengjie Fang, Ling‐Long Tang, Jie Tian, Di Dong
Source
Journal: IEEE Transactions on Medical Imaging [Institute of Electrical and Electronics Engineers]
Volume/Issue: 1-1
Identifier
DOI: 10.1109/tmi.2025.3558775
Abstract

Providing precise and comprehensive diagnostic information to clinicians is crucial for improving the treatment and prognosis of nasopharyngeal carcinoma. Multi-modal foundation models, which can integrate data from various sources, have the potential to significantly enhance clinical assistance. However, several challenges remain: (1) the lack of large-scale visual-language datasets for nasopharyngeal carcinoma; (2) the inability of existing pre-training and fine-tuning methods to capture the hierarchical features required for complex clinical tasks; (3) current foundation models having limited visual perception due to inadequate integration of multi-modal information. While curriculum learning can improve a model's ability to handle multiple tasks through systematic knowledge accumulation, it still lacks consideration for hierarchical features and their dependencies, affecting knowledge gains. To address these issues, we propose the Hierarchical Feature Fusion Curriculum Learning method, which consists of three stages: visual knowledge learning, coarse-grained alignment, and fine-grained fusion. First, we introduce the Hybrid Contrastive Masked Autoencoder to pre-train visual encoders on 755K multi-modal images of nasopharyngeal carcinoma CT, MRI, and endoscopy to fully extract deep visual information. Then, we construct a 65K visual instruction fine-tuning dataset based on open-source data and clinician diagnostic reports, achieving coarse-grained alignment with visual information in a large language model. Finally, we design a Mixture of Experts Cross Attention structure for deep fine-grained fusion of global multi-modal information. Our model outperforms previously developed specialized models in all key clinical tasks for nasopharyngeal carcinoma, including diagnosis, report generation, tumor segmentation, and prognosis.
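
The abstract names a Mixture of Experts Cross Attention structure for the fine-grained fusion stage but gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' architecture: a gate scores several experts for a text token, the top-k experts each run single-head cross-attention over the visual tokens, and their outputs are averaged with the renormalized gate weights. All names (`moe_cross_attention`, `experts`, `gate`, `top_k`) and the choice of top-k routing are assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in W]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention of one query over many tokens."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def moe_cross_attention(text_token, visual_tokens, experts, gate, top_k=2):
    """Hypothetical MoE cross-attention for one text token.

    experts: list of (Wq, Wk, Wv) projection matrices, one triple per expert.
    gate:    matrix with one row per expert; gate @ text_token gives routing logits.
    Returns the fused vector and the indices of the experts that were used.
    """
    gate_probs = softmax(matvec(gate, text_token))
    chosen = sorted(range(len(experts)), key=lambda i: gate_probs[i], reverse=True)[:top_k]
    norm = sum(gate_probs[i] for i in chosen)  # renormalize over the selected experts

    fused = None
    for i in chosen:
        Wq, Wk, Wv = experts[i]
        q = matvec(Wq, text_token)                    # text token acts as the query
        ks = [matvec(Wk, v) for v in visual_tokens]   # visual tokens act as keys
        vs = [matvec(Wv, v) for v in visual_tokens]   # and as values
        attended = cross_attention(q, ks, vs)
        scaled = [(gate_probs[i] / norm) * a for a in attended]
        fused = scaled if fused is None else [f + s for f, s in zip(fused, scaled)]
    return fused, chosen


if __name__ == "__main__":
    random.seed(0)
    dim, n_experts, n_visual = 4, 3, 5

    def rand_matrix(rows, cols):
        return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

    experts = [(rand_matrix(dim, dim), rand_matrix(dim, dim), rand_matrix(dim, dim))
               for _ in range(n_experts)]
    gate = rand_matrix(n_experts, dim)
    text_token = [random.gauss(0.0, 1.0) for _ in range(dim)]
    visual_tokens = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_visual)]

    fused, chosen = moe_cross_attention(text_token, visual_tokens, experts, gate)
    print("experts used:", chosen)
    print("fused vector:", fused)
```

In this sketch each expert holds its own query, key, and value projections, so different experts can attend to different aspects of the visual input. Whether the paper routes per token, uses top-k selection, or shares projections across experts is not stated in the abstract.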