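Generator (circuit theory)
Computer science
Discriminator
Adversarial system
Security token
Task (project management)
Artificial intelligence
Reinforcement learning
Focus (optics)
Generative grammar
Natural language processing
Machine learning
Computer security
Engineering
Telecommunications
Power (physics)
Physics
Optics
Quantum mechanics
Detector
Systems engineering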
Authors
Xuemiao Zhang,Junfei Liu
Identifier
DOI:10.1109/bibm58861.2023.10385588
Abstract
Medical image report generation aims to automatically generate medical reports from radiology images, freeing radiologists from the tedious task of writing them. Generating report text that matches the content of a given medical image, focuses on local anomalies, and conforms fluently to professional reporting norms requires linking visual patterns to informative natural-language descriptions. In this paper, we propose a novel generative adversarial framework (MIRGAN) that guides the generator to produce medical reports indistinguishable from those written by professional radiologists. MIRGAN introduces a multimodal discriminator that evaluates the report generator against a comprehensive objective comprising three sub-objectives as each token is generated. MIRGAN then uses this evaluation as the reward in reinforcement learning to guide the generator toward the desired objective in real time. We conduct extensive experiments to evaluate MIRGAN on two widely used datasets for chest-related diseases. The experimental results show that MIRGAN can significantly improve the performance of the generator on most metrics.
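The token-level reward idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three sub-objectives, so the names `image_match`, `anomaly_focus`, and `fluency`, the equal weights, and the mean-reward baseline are all assumptions made for illustration, and a generic REINFORCE-style loss stands in for whatever update rule MIRGAN actually uses.

```python
from typing import Sequence


def combined_reward(
    image_match: float,
    anomaly_focus: float,
    fluency: float,
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of three hypothetical discriminator sub-scores.

    The sub-objective names and equal weights are assumptions;
    the abstract only states that there are three sub-objectives.
    """
    scores = (image_match, anomaly_focus, fluency)
    return sum(w * s for w, s in zip(weights, scores))


def reinforce_loss(log_probs: Sequence[float], rewards: Sequence[float]) -> float:
    """REINFORCE-style loss with a mean-reward baseline.

    log_probs[t] is the generator's log-probability of the token sampled
    at step t; rewards[t] is the per-token reward for that step.
    Returns -sum_t (r_t - b) * log p_t, where b is the mean reward.
    """
    if len(log_probs) != len(rewards):
        raise ValueError("log_probs and rewards must have the same length")
    if not rewards:
        raise ValueError("at least one token is required")
    baseline = sum(rewards) / len(rewards)
    return -sum((r - baseline) * lp for lp, r in zip(log_probs, rewards))


# Two generated tokens, each scored by the (hypothetical) discriminator.
token_rewards = [
    combined_reward(0.9, 0.8, 0.7),
    combined_reward(0.2, 0.1, 0.3),
]
token_log_probs = [-0.5, -1.5]
loss = reinforce_loss(token_log_probs, token_rewards)
```

Minimizing this loss raises the probability of tokens whose reward is above the baseline and lowers it for tokens below the baseline, which is one common way to turn a per-token discriminator score into a training signal.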