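Natural (archaeology)
Segmentation
Computer vision
Encoder
Artificial intelligence
Image (mathematics)
Computer science
Image segmentation
Geography
Operating system
Archaeology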
Authors
Xinyu Xiong, Zihuang Wu, Shuangyi Tan, Wenxue Li, Feilong Tang, Ying Chen, Siying Li, Jie Ma, Guanbin Li
Source
Journal: Cornell University - arXiv
Date: 2024-08-16
Citations: 15
Identifier
DOI: 10.48550/arxiv.2408.08870
Abstract
Image segmentation plays an important role in visual understanding. Recently, emerging vision foundation models have continued to achieve superior performance on a variety of tasks. Following this success, in this paper we show that the Segment Anything Model 2 (SAM2) can serve as a strong encoder for U-shaped segmentation models. We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation. Specifically, SAM2-UNet adopts the Hiera backbone of SAM2 as the encoder, while the decoder uses the classic U-shaped design. Additionally, adapters are inserted into the encoder to allow parameter-efficient fine-tuning. Preliminary experiments on various downstream tasks, namely camouflaged object detection, salient object detection, marine animal segmentation, mirror detection, and polyp segmentation, demonstrate that SAM2-UNet can outperform existing specialized state-of-the-art methods without bells and whistles. Project page: https://github.com/WZH0120/SAM2-UNet
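The encoder/decoder pairing described in the abstract can be illustrated with a small shape trace. This is only a sketch under stated assumptions, not the authors' implementation: the stage widths (144, 288, 576, 1152), the 352-pixel input, the decoder width of 64, and the bottleneck-MLP form of the adapter are all hypothetical choices made for illustration, and the helper names `encoder_shapes`, `decoder_shapes`, and `adapter_params` do not come from the SAM2-UNet code base.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def encoder_shapes(image_size: int, channels: List[int],
                   first_stride: int = 4) -> List[Shape]:
    """Output shapes of a hierarchical encoder whose stages halve the resolution.

    The stride schedule (4, 8, 16, 32) is an assumption typical of
    hierarchical backbones; it is not taken from the abstract.
    """
    shapes = []
    stride = first_stride
    for c in channels:
        side = image_size // stride
        shapes.append((c, side, side))
        stride *= 2
    return shapes


def decoder_shapes(skips: List[Shape], width: int) -> List[Tuple[int, Shape]]:
    """Trace a classic U-shaped decoder over the encoder's skip features.

    Each step upsamples by 2, concatenates the matching skip feature along
    the channel axis, and fuses the result down to `width` channels.
    Returns (channels after concatenation, output shape) for every step.
    """
    c, h, w = skips[-1]  # start from the deepest feature map
    steps = []
    for skip_c, skip_h, skip_w in reversed(skips[:-1]):
        h, w = 2 * h, 2 * w
        if (h, w) != (skip_h, skip_w):
            raise ValueError("upsampled map and skip feature differ in size")
        concat_c = c + skip_c
        c = width  # the fusing block is assumed to output `width` channels
        steps.append((concat_c, (c, h, w)))
    return steps


def adapter_params(dim: int, bottleneck: int) -> int:
    """Trainable parameters of a hypothetical bottleneck adapter.

    Assumes a down-projection (dim -> bottleneck) and an up-projection
    (bottleneck -> dim), both linear layers with biases.
    """
    down = dim * bottleneck + bottleneck
    up = bottleneck * dim + dim
    return down + up


if __name__ == "__main__":
    enc = encoder_shapes(352, [144, 288, 576, 1152])
    for concat_c, out_shape in decoder_shapes(enc, width=64):
        print(concat_c, out_shape)
    print(adapter_params(144, 32))
```

In a parameter-efficient setup of this kind, the backbone weights would stay frozen and only the adapter and decoder parameters would be updated, which is why the adapter's parameter count is tracked separately here.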