Computer Vision
Image Segmentation
Artificial Intelligence
Computer Science
Segmentation
Remote Sensing
Geology
Identifier
DOI:10.1109/tgrs.2025.3603946
Abstract
Integrating a semantic segmentation network into an Unmanned Aerial Vehicle (UAV) improves situational awareness and facilitates autonomous operations in dynamic environments. However, designing such a network for onboard deployment is challenging, as it must achieve high performance while maintaining low computational and memory requirements and ensuring real-time processing capabilities. UAVs typically operate at high altitudes, providing broad ground coverage; however, this results in key objects—such as humans, vehicles, and obstacles—appearing smaller in the imagery, thereby complicating their accurate identification. To address these challenges, we propose a lightweight semantic segmentation network and a network-agnostic loss specifically designed for UAV imagery. The Dual Sparsity Transformer (DST) incorporates two forms of sparsity: data-based sparsity, which reduces computational complexity; and content-based sparsity, which filters out irrelevant information to generate more refined aggregated features. The novel loss leverages predicted contours to capture complex patterns, boundaries, and small objects, imposing a higher penalty for misclassifications in these areas. This encourages the network to prioritize the accurate detection of challenging-to-distinguish objects. Our approach exhibits remarkable accuracy and real-time throughput for 4K resolution images on a mobile GPU, highlighting its effectiveness for onboard deployment in UAV systems.
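The abstract's loss imposes a higher per-pixel penalty near predicted contours so the network focuses on boundaries and small objects. The sketch below is a minimal, hypothetical NumPy illustration of that idea, not the paper's actual loss: it derives a contour map from the predicted label map (pixels whose predicted class differs from a 4-neighbour), then scales a per-pixel cross-entropy by a larger weight on those contour pixels. The function names and the `boundary_weight` parameter are assumptions for illustration.

```python
import numpy as np

def contour_weight_map(pred_labels, boundary_weight=4.0):
    """Weight map that up-weights pixels lying on predicted class contours.

    A pixel counts as a contour pixel when its predicted label differs
    from any 4-neighbour. This is a simplified stand-in for the paper's
    predicted-contour term.
    """
    boundary = np.zeros(pred_labels.shape, dtype=bool)
    boundary[:-1, :] |= pred_labels[:-1, :] != pred_labels[1:, :]
    boundary[1:, :] |= pred_labels[1:, :] != pred_labels[:-1, :]
    boundary[:, :-1] |= pred_labels[:, :-1] != pred_labels[:, 1:]
    boundary[:, 1:] |= pred_labels[:, 1:] != pred_labels[:, :-1]
    return np.where(boundary, boundary_weight, 1.0)

def contour_weighted_ce(logits, target, boundary_weight=4.0):
    """Per-pixel cross-entropy scaled by the contour weight map.

    logits: (C, H, W) raw class scores; target: (H, W) integer labels.
    Returns the weighted mean loss over all pixels.
    """
    # Softmax over the class axis, numerically stabilized.
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    probs = e / e.sum(axis=0, keepdims=True)
    h, w = target.shape
    # Probability assigned to the ground-truth class at each pixel.
    p_true = probs[target, np.arange(h)[:, None], np.arange(w)[None, :]]
    ce = -np.log(p_true + 1e-12)
    # Contours come from the *predicted* label map, as in the abstract.
    weights = contour_weight_map(logits.argmax(axis=0), boundary_weight)
    return (weights * ce).sum() / weights.sum()
```

A real implementation would compute this on the GPU inside the training loop (e.g. as a PyTorch loss module) and likely combine it with a standard segmentation loss; the weighting scheme shown here is only one plausible instantiation of "higher penalty for misclassifications in these areas."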