High resolution
Resolution (logic)
Computer science
Remote sensing
Geology
Artificial intelligence
Authors
Feng Zhang,Hailong Ma,Ze Li,Lei Chen
Abstract
In the absence of high-frequency visual observations, low-resolution (LR) targets (e.g., objects, human body keypoints) are intrinsically difficult to detect in unconstrained images. This difficulty is further exacerbated by the downsampling operations (e.g., pooling, striding) typical of existing deep networks (e.g., CNNs). To tackle this challenge, we introduce a generic High-Frequency Information Preservation (HFIP) block as a replacement for existing downsampling operations. It is composed of two key components: (1) decoupled high-frequency learning, which extracts high-frequency information along the vertical and horizontal directions separately, and (2) dilated frequency-aware channel correlation, which decomposes the input feature map into multiple smaller ones in a dilated manner, concatenates them along the channel dimension, and then correlates the combined channels in the frequency space. The module can be integrated into existing network architectures for target detection (e.g., YOLO, HRNet). Extensive experiments on low-resolution human pose estimation and object detection tasks show that HFIP significantly boosts the performance of state-of-the-art detection models, e.g., improving the object detection accuracy of YOLOv5s by an absolute margin of 3.30% in mAP at a resolution of 640×640 on the COCO benchmark.
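The two ideas named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' HFIP implementation: the function names `directional_high_freq` and `dilated_decompose` are hypothetical, first-order differences stand in for the learned high-frequency extraction, and the frequency-space channel correlation step is omitted because the abstract does not specify it. The sketch only shows that a dilated (strided) decomposition halves the spatial size while keeping every input value, unlike pooling or plain striding.

```python
import numpy as np

def directional_high_freq(x):
    """Assumed stand-in for decoupled high-frequency extraction.

    x has shape (C, H, W). Returns first-order differences taken
    separately along the vertical (rows) and horizontal (columns) axes.
    The last row/column is repeated so the output keeps the input shape.
    """
    dv = np.diff(x, axis=1, append=x[:, -1:, :])  # vertical direction
    dh = np.diff(x, axis=2, append=x[:, :, -1:])  # horizontal direction
    return dv, dh

def dilated_decompose(x, rate=2):
    """Split (C, H, W) into rate*rate sub-maps sampled with step `rate`
    at every offset, then concatenate them along the channel axis.
    Output shape: (C * rate * rate, H // rate, W // rate).
    """
    _, h, w = x.shape
    if h % rate or w % rate:
        raise ValueError("H and W must be divisible by rate")
    parts = [x[:, i::rate, j::rate] for i in range(rate) for j in range(rate)]
    return np.concatenate(parts, axis=0)

rng = np.random.default_rng(0)
feat = rng.standard_normal((3, 8, 8))   # toy feature map, not real data
dv, dh = directional_high_freq(feat)
down = dilated_decompose(feat, rate=2)
print(dv.shape, dh.shape, down.shape)
```

In this sketch the first three output channels of `down` equal a plain stride-2 subsampling of `feat`, and the remaining channels hold the samples that striding would discard, which is why no information is lost at the downsampling step.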