Authors
Shipeng Han, Chen Chai, Min Hu, Yanni Wang, Teng Jiao, Jianqi Wang, Hao Lv
Source
Journal: Sensors (MDPI AG)
Date: 2025-04-25
Volume/Issue: 25 (9): 2727
Abstract
Multimodal pedestrian detection has garnered significant attention due to its potential applications in complex scenarios. The complementary characteristics of the infrared and visible modalities can enhance detection performance. However, the design of cross-modal fusion mechanisms and the in-depth exploitation of inter-modal complementarity still pose challenges. To address this, we propose TCCDNet, a novel network that integrates cross-modal complementarity. Specifically, the efficient multi-scale attention C2f (EMAC) module is designed for the backbone; it combines the C2f structure with an efficient multi-scale attention mechanism to achieve feature weighting and fusion, thereby enhancing the model's feature extraction capacity. Subsequently, the cross-modal complementarity (CMC) module is proposed, which improves feature discriminability and object localization accuracy through a synergistic mechanism combining channel attention and spatial attention. Additionally, a deep semantic fusion module (DSFM) based on a cross-attention mechanism is incorporated to achieve deep semantic feature fusion. The experimental results demonstrate that TCCDNet achieves a log-average miss rate (MR⁻²) of 7.87% on the KAIST dataset, a 3.83-percentage-point reduction compared to YOLOv8. On two further multimodal pedestrian detection datasets, TCCDNet attains mAP50 scores of 83.8% on FLIR ADAS and 97.3% on LLVIP, outperforming the baseline by 3.6 and 1.9 percentage points respectively. These results validate the effectiveness and superiority of the proposed method.
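The abstract describes the CMC module only at a high level: each modality's features are reweighted by a synergy of channel attention and spatial attention before fusion. The sketch below is a minimal, framework-free illustration of that general idea, not the paper's actual architecture; the function names (`channel_attention`, `spatial_attention`, `cmc_fuse`) and the specific fusion order are assumptions for illustration, and the learned convolutions of a real implementation are replaced by simple pooling.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(f):
    # f: (C, H, W). Squeeze the spatial dimensions into one
    # descriptor per channel and turn it into a gating weight.
    w = sigmoid(f.mean(axis=(1, 2)))          # (C,)
    return f * w[:, None, None]

def spatial_attention(f):
    # Pool across channels to get a (H, W) map highlighting
    # spatial locations, then gate the features with it.
    m = sigmoid(f.mean(axis=0))               # (H, W)
    return f * m[None, :, :]

def cmc_fuse(f_vis, f_ir):
    # Hypothetical cross-modal complementarity step: each branch
    # is enriched with the channel-attended features of the other
    # modality, refined spatially, and the two branches are summed.
    vis_branch = spatial_attention(f_vis + channel_attention(f_ir))
    ir_branch = spatial_attention(f_ir + channel_attention(f_vis))
    return vis_branch + ir_branch

rng = np.random.default_rng(0)
f_vis = rng.random((8, 4, 4))   # toy visible-modality feature map
f_ir = rng.random((8, 4, 4))    # toy infrared-modality feature map
fused = cmc_fuse(f_vis, f_ir)   # shape is preserved: (8, 4, 4)
```

In an actual detector the pooling steps would be replaced by learned layers (e.g. 1x1 convolutions for channel attention and a convolution over pooled maps for spatial attention), but the gating-and-sum structure shown here is the common pattern such modules follow.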