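Computer science
Context (archaeology)
Task (project management)
Spurious relationship
Artificial intelligence
Feature (linguistics)
Task analysis
Machine learning
Class (philosophy)
Distillation
Human-computer interaction
Pattern recognition (psychology)
Natural language processing
Economics
Management
Chemistry
Organic chemistry
Paleontology
Philosophy
Biology
Linguistics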
Authors
Shuangchun Gui,Zhenkun Wang,Chen Ji-xiang,Xun Zhou,Chen Zhang,Yi Cao
Identifier
DOI:10.1109/tmi.2023.3345736
Abstract
The recognition of surgical triplets plays a critical role in the practical application of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents a novel multi-teacher knowledge distillation framework for multi-task triplet learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained on less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM). This module utilizes attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, achieving significant improvements of up to 6.4% on the cross-validation split.
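The abstract does not give the loss function, so the following is only a minimal sketch of how a multi-task student might be guided by several sub-task teachers. It assumes multi-label outputs scored with sigmoid probabilities and a simple weighted sum of a ground-truth term and a teacher term. The function names, the `alpha` weight, and the toy logits are hypothetical and are not taken from MT4MTL-KD; the feature attention module (FAM) is not modeled here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_bce(student_logits, target_probs, eps=1e-7):
    """Mean binary cross-entropy between student sigmoid outputs and targets in [0, 1]."""
    total = 0.0
    for z, t in zip(student_logits, target_probs):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(student_logits)


def multi_teacher_kd_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Combine ground-truth and teacher supervision (assumed form, not the paper's).

    All arguments are dicts keyed by task name. Every student head gets a
    ground-truth term; only heads with a matching teacher get a distillation term.
    """
    hard = 0.0
    soft = 0.0
    for task, z_s in student_logits.items():
        hard += soft_bce(z_s, labels[task])
        if task in teacher_logits:
            teacher_probs = [sigmoid(z) for z in teacher_logits[task]]
            soft += soft_bce(z_s, teacher_probs)
    return (1.0 - alpha) * hard + alpha * soft


# Toy example: one frame, made-up logits for each head.
student = {
    "instrument": [2.0, -1.0],
    "verb": [0.5, -0.5, 1.0],
    "target": [-2.0, 1.5],
    "triplet": [0.3, -0.7, 1.2, -1.1],
}
# One teacher per sub-task; the triplet head has no teacher in this sketch.
teachers = {
    "instrument": [3.0, -2.0],
    "verb": [1.0, -1.5, 2.0],
    "target": [-3.0, 2.5],
}
labels = {
    "instrument": [1.0, 0.0],
    "verb": [0.0, 0.0, 1.0],
    "target": [0.0, 1.0],
    "triplet": [0.0, 0.0, 1.0, 0.0],
}

loss = multi_teacher_kd_loss(student, teachers, labels, alpha=0.5)
print(f"combined loss: {loss:.4f}")
```

In this sketch the triplet head is supervised only by ground truth while the instrument, verb, and target heads also receive soft targets from their teachers, which mirrors the idea of letting less imbalanced sub-tasks assist the triplet task. How the actual framework weights or aligns these terms would need to be taken from the paper itself.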