Vehicle Detection Based on Adaptive Multimodal Feature Fusion and Cross-Modal Vehicle Index Using RGB-T Images

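Artificial intelligence, Computer science, RGB color model, Computer vision, Feature (linguistics), Object detection, Pyramid (geometry), Feature extraction, Modal verb, Pattern recognition (psychology), Mathematics, Philosophy, Linguistics, Chemistry, Geometry, Polymer chemistry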
Authors
Yuanfeng Wu, Xinran Guan, Boya Zhao, Li Ni, Min Huang
Source
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing [Institute of Electrical and Electronics Engineers]
Volume: 16, pages 8166-8177. Citations: 31
Identifier
DOI:10.1109/jstars.2023.3294624
Abstract

Target detection is a critical task in interpreting aerial images, and detecting small targets such as vehicles is particularly challenging. Lighting conditions affect the accuracy of vehicle detection. For example, vehicles are difficult to distinguish from the background in red, green, blue (RGB) images under low-illumination conditions. In contrast, under high-illumination conditions, the color and texture of vehicles are not significantly different in thermal infrared (TIR) images. To improve the accuracy of vehicle detection under various illumination conditions, we propose an adaptive multimodal feature fusion and cross-modal vehicle index (AFFCM) model for vehicle detection. AFFCM is based on a single-stage object detection model and takes both RGB and TIR images as input. It comprises three parts: 1) a softpooling channel attention (SCA) mechanism, which calculates the cross-modal feature weights of the RGB and TIR features using a fully connected layer during global weighted pooling; 2) a multimodal adaptive feature fusion (MAFF) module based on the cross-modal feature weights derived from the SCA mechanism, which selects features with high weight, compresses redundant features with low weight, and performs adaptive fusion using a multiscale feature pyramid; and 3) a cross-modal vehicle index, which is established to extract the target area, suppress complex background information, and minimize false alarms in vehicle detection. The mean average precision (mAP) on the Drone Vehicle dataset is 14.44% and 5.02% higher than that obtained using only RGB or only TIR images, respectively. The mAP is also 2.63% higher than that of state-of-the-art methods that use both RGB and TIR images.
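
The abstract does not give the SCA or MAFF equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "global weighted pooling" means a softmax-weighted spatial average per channel, that a single fully connected layer maps the pooled RGB and TIR descriptors to logits, and that a softmax across the two modalities yields per-channel fusion weights. The names `soft_pool`, `modality_weights`, and `fuse`, and the parameters `fc_weight` and `fc_bias`, are hypothetical. The multiscale feature pyramid and the cross-modal vehicle index are not modeled here.

```python
import math
from typing import List, Sequence, Tuple

Channel = List[float]        # one channel with spatial positions flattened
FeatureMap = List[Channel]   # C channels of equal length


def soft_pool(channel: Sequence[float]) -> float:
    """Global soft pooling: softmax-weighted average of the activations."""
    peak = max(channel)                              # subtract max for stability
    exps = [math.exp(v - peak) for v in channel]
    return sum(e * v for e, v in zip(exps, channel)) / sum(exps)


def modality_weights(
    rgb: FeatureMap,
    tir: FeatureMap,
    fc_weight: Sequence[Sequence[float]],            # assumed shape: 2C x 2C
    fc_bias: Sequence[float],                        # assumed length: 2C
) -> Tuple[List[float], List[float]]:
    """Per-channel RGB and TIR weights that sum to 1 for each channel."""
    # Pooled descriptor: C values from RGB followed by C values from TIR.
    pooled = [soft_pool(c) for c in rgb] + [soft_pool(c) for c in tir]
    # One fully connected layer (assumption: no hidden layer, no activation).
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(fc_weight, fc_bias)
    ]
    channels = len(rgb)
    w_rgb: List[float] = []
    w_tir: List[float] = []
    for k in range(channels):
        a, b = logits[k], logits[channels + k]
        peak = max(a, b)
        ea, eb = math.exp(a - peak), math.exp(b - peak)
        w_rgb.append(ea / (ea + eb))                 # softmax across modalities
        w_tir.append(eb / (ea + eb))
    return w_rgb, w_tir


def fuse(
    rgb: FeatureMap, tir: FeatureMap,
    w_rgb: Sequence[float], w_tir: Sequence[float],
) -> FeatureMap:
    """Channel-wise weighted sum of the two modalities."""
    return [
        [wr * r + wt * t for r, t in zip(rc, tc)]
        for rc, tc, wr, wt in zip(rgb, tir, w_rgb, w_tir)
    ]


if __name__ == "__main__":
    # Toy input: one channel, four spatial positions; TIR responds, RGB does not.
    rgb_feat = [[0.0, 0.0, 0.0, 0.0]]
    tir_feat = [[1.0, 1.0, 1.0, 1.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]              # placeholder FC parameters
    w_r, w_t = modality_weights(rgb_feat, tir_feat, identity, [0.0, 0.0])
    print(w_r, w_t, fuse(rgb_feat, tir_feat, w_r, w_t))
```

In this toy setting the modality with the stronger pooled response receives the larger weight, which mirrors the abstract's description of selecting high-weight features and compressing low-weight ones. In a trained model the fully connected parameters would be learned rather than fixed to an identity matrix.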