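Computer science
Digital pathology
Semantics (computer science)
Artificial intelligence
Image retrieval
Encoder
Modal verb
Representation (politics)
Source code
Pattern recognition (psychology)
Information retrieval
Matching (statistics)
Image (mathematics)
Pathology
Operating system
Politics
Chemistry
Polymer chemistry
Programming language
Law
Medicine
Political science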
Authors
Dingyi Hu, Zhiguo Jiang, Jun Shi, Fengying Xie, Kun Wu, Kunming Tang, Ming Cao, Jianguo Huai, Yushan Zheng
Identifier
DOI:10.1016/j.media.2024.103163
Abstract
The analysis of large-scale digital whole slide image (WSI) datasets has gained significant attention in computer-aided cancer diagnosis. Content-based histopathological image retrieval (CBHIR) is a technique that searches a large database for data samples matching input objects in both details and semantics, offering relevant diagnostic information to pathologists. However, current methods are limited by the gigapixel scale of WSIs, their variable size, and the dependence on manual annotations. In this work, we propose a novel histopathology language-image representation learning framework for fine-grained digital pathology cross-modal retrieval, which uses paired diagnosis reports to learn fine-grained semantics from the WSI. An anchor-based WSI encoder is built to extract hierarchical region features, and a prompt-based text encoder is introduced to learn fine-grained semantics from the diagnosis reports. The proposed framework is trained with a multivariate cross-modal loss function to learn semantic information from the diagnosis report at both the instance level and the region level. After training, it can perform four types of retrieval tasks on the multi-modal database to support diagnostic requirements. We evaluated the proposed method on an in-house dataset and a public dataset. The experiments demonstrate its effectiveness and its advantages over existing histopathology retrieval methods. The code is available at https://github.com/hudingyi/FGCR.
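As a rough illustration of what retrieval over a multi-modal database can look like once both encoders map into a shared embedding space, the sketch below ranks database entries by cosine similarity to a query embedding. It is not the authors' implementation: the function names (`cosine`, `retrieve`), the database layout, and the three-dimensional embeddings are all hypothetical, and the abstract does not specify the similarity measure or the four retrieval tasks, so text-to-image and image-to-text are shown here only as assumed examples.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, database, modality, top_k=2):
    """Return the ids of the top_k entries whose embedding of the given
    modality ("wsi" or "report") is most similar to the query embedding."""
    scored = [(cosine(query, entry[modality]), entry["id"]) for entry in database]
    scored.sort(reverse=True)  # highest similarity first
    return [entry_id for _, entry_id in scored[:top_k]]


# Hypothetical database: each case stores one WSI embedding and one
# report embedding. The values are invented for illustration only.
database = [
    {"id": "case_a", "wsi": [0.9, 0.1, 0.0], "report": [0.8, 0.2, 0.1]},
    {"id": "case_b", "wsi": [0.1, 0.9, 0.1], "report": [0.0, 1.0, 0.2]},
    {"id": "case_c", "wsi": [0.0, 0.2, 0.9], "report": [0.1, 0.1, 0.9]},
]

# Assumed outputs of a text encoder and a WSI encoder, respectively.
text_query = [0.0, 0.9, 0.1]
image_query = [0.85, 0.15, 0.05]

text_to_image = retrieve(text_query, database, "wsi")
image_to_text = retrieve(image_query, database, "report")
print(text_to_image, image_to_text)
```

Because the query and the database entries live in the same space in this sketch, switching retrieval direction only changes which stored embedding is compared against; the same function would also cover same-modality queries.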