Computer science
Knowledge base
Embedding
Modal verb
Feature (linguistics)
Artificial intelligence
Coding (set theory)
Information retrieval
Radiology
Natural language processing
Machine learning
Medical physics
Medicine
Programming language
Linguistics
Philosophy
Set (abstract data type)
Chemistry
Polymer chemistry
Authors
Shuxin Yang,Xian Wu,Shen Ge,Zhuozhao Zheng,S. Kevin Zhou,Li Xiao
Identifier
DOI:10.1016/j.media.2023.102798
Abstract
In clinics, a radiology report is crucial for guiding a patient's treatment. However, writing radiology reports is a heavy burden for radiologists. To this end, we present an automatic, multi-modal approach for report generation from a chest x-ray. Our approach, motivated by the observation that the descriptions in radiology reports are highly correlated with specific information in the x-ray images, features two distinct modules: (i) Learned knowledge base: to absorb the knowledge embedded in radiology reports, we build a knowledge base that can automatically distill and restore medical knowledge from textual embeddings without manual labor; (ii) Multi-modal alignment: to promote semantic alignment among reports, disease labels, and images, we explicitly use textual embeddings to guide the learning of the visual feature space. We evaluate the proposed model using metrics of both natural language generation and clinical efficacy on the public IU-Xray and MIMIC-CXR datasets. Our ablation study shows that each module contributes to improving the quality of the generated reports. Furthermore, with the assistance of both modules, our approach outperforms state-of-the-art methods on almost all metrics. Code is available at https://github.com/LX-doctorAI1/M2KT.
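The abstract's multi-modal alignment module uses textual embeddings to guide the visual feature space. A minimal sketch of that general idea (not the authors' actual loss; see the linked repository for the real implementation) is an alignment objective that pulls each image embedding toward its paired report embedding via cosine similarity. The function and array names below are hypothetical, and NumPy stands in for a deep-learning framework:

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale each row to unit length so the dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def alignment_loss(visual, text):
    """Mean (1 - cosine similarity) over paired visual/text embeddings.

    0 means every image embedding points in the same direction as its
    paired report embedding; larger values mean weaker alignment.
    """
    v = l2_normalize(visual)
    t = l2_normalize(text)
    cos = np.sum(v * t, axis=-1)          # per-pair cosine similarity
    return float(np.mean(1.0 - cos))

# Toy batch of 2 paired embeddings (dim 4), values chosen for illustration:
visual = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
text   = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])

print(alignment_loss(visual, text))  # ~0.5: first pair aligned, second orthogonal
```

Minimizing such a loss with respect to the visual encoder's parameters (alongside the report-generation loss) is one common way text embeddings can shape a visual feature space; the paper's learned knowledge base adds a further distillation step not sketched here.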