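Computer science
Encoding (memory)
Artificial intelligence
Weighting
Feature (linguistics)
Representation (politics)
Distillation
Perspective (graphical)
Feature vector
Machine learning
Pattern recognition (psychology)
Philosophy
Law
Chemistry
Organic chemistry
Radiology
Politics
Medicine
Linguistics
Political science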
Authors
Guohao Peng, Yifeng Huang, Heshan Li, Zhenyu Wu, Danwei Wang
Identifier
DOI:10.1109/iros47612.2022.9982272
Abstract
Visual Place Recognition (VPR) has become an indispensable capacity for mobile robots to operate in large-scale environments. Existing methods in this field mostly focus on exploring high-performance encoding strategies, while few attempts are devoted to lightweight models that balance performance and computational cost. In this work, we propose a Lightweight Self-attentional Distillation Network (LSDNet) aiming to obtain advantages of both performance and efficiency. (1) From a performance perspective, an attentional encoding strategy is proposed to integrate crucial information in the scene. It extends the NetVLAD architecture with a self-attention module to facilitate non-local information interaction between local features. Through further visual word vector rescaling, the final image representation can benefit from both non-local spatial integration and cluster-wise weighting. (2) From an efficiency perspective, LSDNet is built upon a lightweight backbone. To maintain comparable performance to large backbone models, a dual distillation strategy is introduced. It prompts LSDNet to learn both encoding patterns in the hidden space and feature distributions in the encoding space from the teacher model. Through distillation-augmented training, LSDNet is able to rival the teacher model and outperform SOTA global representations with the same lightweight backbone.
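The pipeline outlined in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so everything below is an assumption made for illustration. In particular, the attention uses identity query/key/value projections with a residual connection, the soft assignment is derived from squared distances to the cluster centers (a common NetVLAD-style initialization), the per-cluster weights are passed in as fixed numbers rather than learned, and both distillation terms use mean squared error. All function and parameter names (`self_attention`, `vlad_encode`, `cluster_weights`, `dual_distillation_loss`, `alpha`, `lam`) are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v, eps=1e-12):
    n = math.sqrt(dot(v, v))
    return [x / (n + eps) for x in v]

def self_attention(feats):
    """Non-local mixing: every local feature attends to all local features.
    Assumption: identity Q/K/V projections plus a residual connection."""
    d = len(feats[0])
    out = []
    for q in feats:
        w = softmax([dot(q, k) / math.sqrt(d) for k in feats])
        mixed = [sum(wj * k[i] for wj, k in zip(w, feats)) for i in range(d)]
        out.append([a + b for a, b in zip(q, mixed)])
    return out

def vlad_encode(feats, centers, cluster_weights, alpha=10.0):
    """NetVLAD-style soft-assignment residual aggregation, followed by a
    per-cluster (visual word) rescaling. Assumption: weights are given."""
    d = len(centers[0])
    vlad = [[0.0] * d for _ in centers]
    for x in feats:
        # Soft assignment from negative squared distance to each center.
        a = softmax([-alpha * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                     for c in centers])
        for k, c in enumerate(centers):
            for i in range(d):
                vlad[k][i] += a[k] * (x[i] - c[i])
    desc = []
    for k, row in enumerate(vlad):
        row = l2_normalize(row)  # intra-cluster normalization
        desc.extend(cluster_weights[k] * v for v in row)
    return l2_normalize(desc)  # final global descriptor, unit length

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def dual_distillation_loss(s_hidden, t_hidden, s_desc, t_desc, lam=1.0):
    """Assumed form: MSE between student and teacher local features (hidden
    space) plus lam times MSE between global descriptors (encoding space)."""
    hidden = sum(mse(s, t) for s, t in zip(s_hidden, t_hidden)) / len(s_hidden)
    return hidden + lam * mse(s_desc, t_desc)

# Toy usage: three 2-D local features, two cluster centers.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
centers = [[1.0, 0.0], [0.0, 1.0]]
hidden = self_attention(feats)
desc = vlad_encode(hidden, centers, cluster_weights=[1.0, 0.5])
loss = dual_distillation_loss(hidden, feats, desc, desc)
```

In a real system the local features would come from a CNN backbone, and the projections, cluster centers and cluster weights would be trained. The sketch only shows how the three ideas named in the abstract (non-local interaction, cluster-wise weighting, and supervision in two spaces) could fit together.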