A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image–Text Pairs

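Foundation (evidence), Image (mathematics), Computer science, Artificial intelligence, Information retrieval, Computer vision, History, Archaeology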
Authors
Sheng Zhang, Yanbo Xu, Naoto Usuyama, Hanwen Xu, Jaspreet Bagga, Robert Tinn, Sam Preston, Rajesh Rao, Mu Wei, Naveen Valluri, Cliff Wong, Andrea Tupini, Yu Wang, Matt Mazzola, Swadheen Shukla, Lars Lidén, Jianfeng Gao, Angela Crabtree, Brian Piening, Carlo Bifulco
Identifier
DOI: 10.1056/aioa2400640
Abstract

Background: Biomedical data are inherently multimodal, comprising physical measurements and natural-language narratives. A generalist biomedical artificial intelligence (AI) model needs to simultaneously process different modalities of data, including text and images. Therefore, training an effective generalist biomedical model requires high-quality multimodal data, such as parallel image–text pairs.

Methods: Here, we present PMC-15M, a novel dataset that is two orders of magnitude larger than existing biomedical multimodal datasets, such as MIMIC-CXR, and spans a diverse range of biomedical image types. PMC-15M contains 15 million biomedical image–text pairs collected from 4.4 million scientific articles. Based on PMC-15M, we have pretrained BiomedCLIP, a multimodal foundation model, with domain-specific adaptations tailored to biomedical vision–language processing.

Results: We conducted extensive experiments and ablation studies on standard biomedical imaging tasks from retrieval to classification to visual question answering (VQA). BiomedCLIP achieved new state-of-the-art results in a wide range of standard datasets, substantially outperforming prior approaches. Intriguingly, by large-scale pretraining on diverse biomedical image types, BiomedCLIP even outperforms state-of-the-art radiology-specific models, such as BioViL, in radiology-specific tasks such as Radiological Society of North America (RSNA) pneumonia detection.

Conclusions: BiomedCLIP is a fully open-access foundation model that achieves state-of-the-art performance on various biomedical tasks, paving the way for transformative multimodal biomedical discovery and applications. We release our models at aka.ms/biomedclip to facilitate future research in multimodal biomedical AI.