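Computer science
Adapter (computing)
Segmentation
Artificial intelligence
Transformer
Ranging
Synthetic aperture radar
Computer vision
Pattern recognition (psychology)
Computer engineering
Computer hardware
Telecommunications
Physics
Quantum mechanics
Voltage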
Authors
Xinyang Pu,Hecheng Jia,Linghao Zheng,Feng Wang,Feng Xu
Source
Journal: Cornell University - arXiv
Date: 2024-01-01
Citations: 5
Identifier
DOI:10.48550/arxiv.2401.02326
Abstract
In artificial intelligence, the emergence of foundation models, backed by high computing capabilities and extensive data, has been revolutionary. The Segment Anything Model (SAM), built on the Vision Transformer (ViT) architecture with millions of parameters and trained on the vast SA-1B dataset, excels in a wide range of segmentation scenarios thanks to its rich semantic information and strong generalization ability. This success of a visual foundation model has stimulated ongoing research on specific downstream tasks in computer vision. The ClassWise-SAM-Adapter (CWSAM) is designed to adapt the high-performing SAM to landcover classification on space-borne Synthetic Aperture Radar (SAR) images. CWSAM freezes most of SAM's parameters and incorporates lightweight adapters for parameter-efficient fine-tuning, and a classwise mask decoder is designed to perform the semantic segmentation task. This adapt-tuning method allows efficient landcover classification of SAR images, balancing accuracy against computational demand. In addition, a task-specific input module injects low-frequency information from SAR images through MLP-based layers to improve model performance. In extensive experiments against conventional state-of-the-art semantic segmentation algorithms, CWSAM shows enhanced performance with fewer computing resources, highlighting the potential of leveraging foundation models like SAM for specific downstream tasks in the SAR domain. The source code is available at: https://github.com/xypu98/CWSAM.
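The abstract does not spell out the adapter design, so the following is only a minimal sketch of a generic bottleneck adapter, the usual building block of parameter-efficient fine-tuning, and not the CWSAM implementation. The class name `BottleneckAdapter`, the helper functions, the GELU activation, the zero-initialised up-projection, and all dimensions are assumptions made for illustration. It is written in plain Python so it runs without a deep learning framework.

```python
import math
import random


def linear(x, weight, bias):
    """Apply y = W x + b to one feature vector x (a list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def gelu(v):
    """Exact GELU activation (assumed here; the abstract names no activation)."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


class BottleneckAdapter:
    """Hypothetical adapter: down-project, activate, up-project, add residual.

    In adapter tuning, only these small matrices would be trained while the
    backbone weights stay frozen.
    """

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        self.dim, self.hidden = dim, hidden
        # Down-projection: dim -> hidden, small random initialisation.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                       for _ in range(hidden)]
        self.b_down = [0.0] * hidden
        # Up-projection: hidden -> dim, zero-initialised (an assumption) so the
        # adapter starts as an identity map and does not disturb the backbone.
        self.w_up = [[0.0] * hidden for _ in range(dim)]
        self.b_up = [0.0] * dim

    def num_params(self):
        """Number of trainable values in the adapter (weights plus biases)."""
        return 2 * self.dim * self.hidden + self.dim + self.hidden

    def __call__(self, x):
        h = [gelu(v) for v in linear(x, self.w_down, self.b_down)]
        delta = linear(h, self.w_up, self.b_up)
        return [xi + di for xi, di in zip(x, delta)]  # residual connection


def dense_layer_params(dim):
    """Parameters of one frozen dim x dim dense layer with bias, for comparison."""
    return dim * dim + dim
```

With an illustrative feature width of 768 and a bottleneck of 32 (both hypothetical), `BottleneckAdapter(768, 32).num_params()` gives 49,952 trainable values against 590,592 for a single dense 768 x 768 layer with bias, which is the kind of saving the abstract refers to when it says most of SAM's parameters are frozen.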