Knowledge Distillation and Contrastive Learning for Detecting Visible-Infrared Transmission Lines Using Separated Stagger Registration Network
Artificial intelligence
Computer science
Overfitting
Machine learning
Deep learning
Noise (video)
Algorithm
Artificial neural network
Image (mathematics)
Authors
Weixing Zhou, Yusen Wang, Xiaohong Qian
Source
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers [Institute of Electrical and Electronics Engineers]; Date: 2025-01-01; Pages: 1-13; Cited by: 2
Identifier
DOI: 10.1109/TCSI.2024.3521933
Abstract
Multimodal transmission-line detection (TLD) and other vision-related tasks in smart grids have garnered increasing attention due to advances in deep-learning technologies and the escalating need for reliable power supplies. However, current TLD methodologies face several limitations. First, complex weather conditions often introduce substantial background noise, resulting in inaccurate object detection. Second, the high parameter counts of existing models impede their deployment in real-world applications. Third, insufficient data samples lead to overfitting and training instability. To address these challenges, we propose a separated stagger registration network (SSRNet-S$^\ast$), augmented with knowledge distillation (KD) and contrastive learning and specifically designed for RGB-T TLD. The method integrates a separated stagger registration mechanism into the fusion module to exploit relationships between cross-modal features, which enhances feature representation and effectively suppresses background noise. Additionally, we devise a joint training framework incorporating KD and contrastive learning, along with a hierarchical distillation strategy that compresses the model while mitigating the impact of limited data. Complementary features are captured at various stages of SSRNet-S$^\ast$ through three levels of distillation. Extensive experiments on a TLD dataset demonstrate that both SSRNet-T and SSRNet-S$^\ast$ (with KD) outperform state-of-the-art methods. With P2T-Large and P2T-Tiny as the backbone networks of SSRNet-T and SSRNet-S$^\ast$, respectively, the parameter count decreases from 68.37M to 15.06M and the floating-point operations from 26.99G to 3.01G. Our code and results are available at https://github.com/WangYuSenn/SSRNet-KD.
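The abstract describes the training objective only at a high level. Joint KD-plus-contrastive frameworks of this kind are commonly implemented as a weighted sum of a temperature-scaled logit-distillation term, feature-level distillation terms over several network stages, and an InfoNCE-style contrastive term. The PyTorch sketch below is a minimal illustration under those assumptions; all function names, loss weights, and temperatures are hypothetical and are not taken from the authors' released code.

```python
# Hypothetical sketch of a joint KD + contrastive objective.
# All names and hyperparameters here are illustrative assumptions,
# not the authors' actual SSRNet-KD implementation.
import torch
import torch.nn.functional as F

def kd_logit_loss(student_logits, teacher_logits, T=4.0):
    """Soft-label distillation: KL divergence between temperature-scaled
    teacher and student distributions."""
    p_t = F.softmax(teacher_logits / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)

def kd_feature_loss(student_feat, teacher_feat):
    """Feature-level distillation: L2 distance between intermediate
    features (student features assumed projected to the teacher's shape)."""
    return F.mse_loss(student_feat, teacher_feat)

def contrastive_loss(z_s, z_t, tau=0.07):
    """InfoNCE-style loss pulling each student embedding toward its
    teacher counterpart and away from other samples in the batch."""
    z_s = F.normalize(z_s, dim=1)
    z_t = F.normalize(z_t, dim=1)
    logits = z_s @ z_t.t() / tau            # (B, B) similarity matrix
    targets = torch.arange(z_s.size(0), device=z_s.device)
    return F.cross_entropy(logits, targets)

def joint_loss(task_loss, s_logits, t_logits, s_feats, t_feats, z_s, z_t):
    """Combined objective over multiple distillation levels.
    The 1.0 / 0.5 / 0.1 weights are placeholder guesses."""
    l_logit = kd_logit_loss(s_logits, t_logits)
    l_feat = sum(kd_feature_loss(s, t) for s, t in zip(s_feats, t_feats))
    l_con = contrastive_loss(z_s, z_t)
    return task_loss + 1.0 * l_logit + 0.5 * l_feat + 0.1 * l_con

if __name__ == "__main__":
    # Smoke test with random tensors (batch 8, 2 classes, 128-d embeddings).
    B, C, D = 8, 2, 128
    demo = joint_loss(
        task_loss=torch.tensor(0.5),
        s_logits=torch.randn(B, C), t_logits=torch.randn(B, C),
        s_feats=[torch.randn(B, D)], t_feats=[torch.randn(B, D)],
        z_s=torch.randn(B, D), z_t=torch.randn(B, D),
    )
    print(demo.item())
```

In a setup like the one the abstract outlines, the three distillation levels would correspond to applying such terms at different stages of the student network; consult the linked repository for the authors' actual configuration.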