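Computer science
Discriminative
Segmentation
Encoder
Artificial intelligence
Transformer
Vocabulary
Feature (linguistics)
Remote sensing
Generalization
Remote sensing application
Feature extraction
Semantic mapping
Machine learning
Land cover
Closed captioning
Semantic feature
Natural language processing
Feature learning
Domain (mathematical analysis)
Semantics (computer science)
Pattern recognition (psychology)
Image segmentation
Word (group theory)
Semantic data model
Topic model
Thematic map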
Authors
Xiaokang Zhang,Chufeng Zhou,Jianzhong Huang,Lefei Zhang
Identifier
DOI:10.1109/tgrs.2025.3624767
Abstract
Remote sensing semantic segmentation faces significant challenges in open-world scenarios due to domain gaps and the presence of unseen categories in test datasets. Open-vocabulary semantic segmentation (OVSS) based on vision-language models (VLMs) has emerged as a promising paradigm for remote sensing imagery interpretation because it enables adaptation to new datasets with arbitrary semantic categories. However, current OVSS approaches often struggle to achieve fine-grained pixel-level localization and classification for unseen categories when relying solely on fixed textual prompts and pretrained VLM encoders. The model's generalization capability is further hindered by textual representations that are insufficiently fine-grained and adaptive. To address these limitations, we propose TPOV-Seg, a textually enhanced prompt tuning approach for OVSS. Specifically, a remote sensing-specific Text TempLator (TTL) is introduced to enrich textual prompts and semantic representations for land cover categories by incorporating synonymous vocabulary combinations. To efficiently align the text encoder with remote sensing characteristics, a Lightweight Text-aware Prompt Tuning (LTP-Tuning) strategy is proposed for contextual modeling in word embedding adaptation. Furthermore, a Textual-Guided Channel-Aware Aggregator (TGCA) is developed to promote inter-channel feature interaction and facilitate semantic modeling, leveraging Grouped Cross-Channel Transformers and linear Transformers under the guidance of enhanced textual features from TTL. Extensive experiments on five large-scale remote sensing segmentation datasets demonstrate that TPOV-Seg outperforms existing methods in OVSS tasks, showing strong discriminative ability for unseen categories while maintaining robust cross-domain generalization. The source code will be available at https://github.com/zxk688/TPOVSeg.
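The idea of enriching textual prompts with synonymous vocabulary can be illustrated with a minimal sketch. The templates, synonym lists, and the function name `expand_prompts` below are hypothetical and chosen only for illustration; they are not taken from the TPOV-Seg paper or its repository, and the actual TTL design may differ.

```python
from itertools import product

# Hypothetical prompt templates and synonym lists (assumptions for
# illustration only; not the templates or vocabulary used by TPOV-Seg).
TEMPLATES = [
    "a satellite image of {}.",
    "an aerial photo of {}.",
]

SYNONYMS = {
    "building": ["building", "house", "rooftop"],
    "water": ["water", "river", "lake"],
}


def expand_prompts(synonyms, templates):
    """Return {category: [prompt, ...]}, one prompt per (template, synonym) pair."""
    return {
        category: [t.format(word) for t, word in product(templates, words)]
        for category, words in synonyms.items()
    }


prompts = expand_prompts(SYNONYMS, TEMPLATES)
for category, texts in prompts.items():
    print(category, len(texts), texts[0])
```

In a VLM-based pipeline, the prompts for each category would typically be passed through the text encoder and their embeddings combined (for example, averaged) into one class representation. That step is omitted here because it depends on the specific encoder.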