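Artificial intelligence
Computer science
Segmentation
Computer vision
Image segmentation
Underwater
RGB color model
Synthetic aperture radar
Scale-space segmentation
Feature extraction
Segmentation-based object categorization
Pattern recognition (psychology)
Feature (linguistics)
Convolutional neural network
Robustness (evolution)
Residual
Deep learning
Frequency domain
Artificial neural network
Merge (version control)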
Authors
Yue Yang, Xiaoyi Feng, Ming Li, Xiangyun Hu, Jiangying Qin, Armin Gruen, Deren Li, Jianya Gong
Identifier
DOI:10.1109/tgrs.2025.3618269
Abstract
Underwater instance segmentation plays a pivotal role in various applications. Among them, coral instance segmentation is of great significance in marine biology and environmental monitoring, and is crucial for a comprehensive understanding of coral reef ecosystems. Traditional methods for underwater instance segmentation predominantly rely on RGB images. However, the complex morphology of corals and strong background interference often result in poor segmentation outcomes. To tackle these problems, this study presents a novel multimodal instance segmentation method, termed UMIS-YOLO, which is built on the YOLO architecture. UMIS-YOLO incorporates a dual-backbone network design that substantially enhances feature extraction for both RGB images and depth images, thereby improving the effectiveness of instance segmentation. In addition, we propose two plug-and-play modules: the Frequency Domain Feature Enhancement Fusion (FDFEF) module and the Residual Feature Fusion (RFF) module. The FDFEF module leverages the Fourier transform to enhance the features of both modalities in the frequency domain, employing learnable weights to enable the complementary integration of amplitude and phase information. The RFF module uses a residual learning strategy to efficiently merge low-level and high-level features prior to the segmentation head, thereby improving pixel-level segmentation accuracy. Additionally, we introduce a challenging high-resolution dataset, UMIS-Coral, which comprises RGB images and depth images captured in complex coral environments. We also extend the UIIS dataset with depth images to further verify the effectiveness of UMIS-YOLO. The experimental results indicate that UMIS-YOLO achieves mAP50 and mAP75 improvements of 2.3 and 3.0 on the UMIS-Coral dataset and of 3.9 and 2.8 on the UIIS dataset, respectively. Furthermore, the model is characterized by its lightweight architecture and rapid segmentation capabilities. The source code and the dataset are publicly accessible at https://github.com/zhangsanhulk/UMIS-YOLO.
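The abstract describes the two fusion ideas only in outline, so the following is a minimal NumPy sketch of what they could look like, not the authors' implementation. The function names `fuse_frequency` and `residual_fuse`, the scalar weights `w_amp`, `w_phase`, and `alpha` (stand-ins for the learnable weights mentioned above), and the nearest-neighbour upsampling are all assumptions made for illustration. The actual FDFEF and RFF modules are in the linked repository.

```python
import numpy as np


def fuse_frequency(rgb_feat, depth_feat, w_amp=0.5, w_phase=0.5):
    """Blend two (C, H, W) feature maps in the frequency domain.

    Hypothetical stand-in for amplitude/phase fusion: w_amp and w_phase
    are fixed scalars here, whereas the abstract describes learnable weights.
    """
    # 2D FFT over the spatial axes of each modality.
    f_rgb = np.fft.fft2(rgb_feat, axes=(-2, -1))
    f_dep = np.fft.fft2(depth_feat, axes=(-2, -1))

    # Weighted mix of the amplitude spectra.
    amp = w_amp * np.abs(f_rgb) + (1.0 - w_amp) * np.abs(f_dep)

    # Mix phases through unit phasors so that angles near +pi and -pi
    # do not average to a meaningless value near zero.
    phasor = (w_phase * np.exp(1j * np.angle(f_rgb))
              + (1.0 - w_phase) * np.exp(1j * np.angle(f_dep)))
    phase = np.angle(phasor)

    # Recombine amplitude and phase, then return to the spatial domain.
    fused = amp * np.exp(1j * phase)
    return np.fft.ifft2(fused, axes=(-2, -1)).real


def residual_fuse(low_feat, high_feat, alpha=1.0):
    """Add an upsampled high-level map to a low-level map (residual style).

    Assumes both maps have the same channel count and that the low-level
    spatial size is an integer multiple of the high-level one.
    """
    scale_h = low_feat.shape[-2] // high_feat.shape[-2]
    scale_w = low_feat.shape[-1] // high_feat.shape[-1]
    # Nearest-neighbour upsampling by repeating rows and columns.
    up = np.repeat(np.repeat(high_feat, scale_h, axis=-2), scale_w, axis=-1)
    # The low-level map passes through unchanged; the high-level map is
    # added as a scaled residual.
    return low_feat + alpha * up


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((4, 8, 8))    # toy RGB-branch features
    depth = rng.standard_normal((4, 8, 8))  # toy depth-branch features
    coarse = rng.standard_normal((4, 4, 4)) # toy high-level features

    fused = fuse_frequency(rgb, depth)
    merged = residual_fuse(fused, coarse)
    print(fused.shape, merged.shape)
```

In this sketch, setting both weights to 1 reproduces the RGB features and setting both to 0 reproduces the depth features, which makes the role of each weight easy to inspect. A trained module would learn these weights, possibly per channel or per frequency.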