Improved YOLOv7 models based on modulated deformable convolution and swin transformer for object detection in fisheye images

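Artificial intelligence; Computer vision; Computer science; Transformer; Object (grammar); Convolution (computer science); Object detection; Pattern recognition (psychology); Physics; Voltage; Artificial neural network; Quantum mechanics
Authors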
Jie Zhou,Degang Yang,Tingting Song,Yichen Ye,Xin Zhang,Yingze Song
Source
Journal: Image and Vision Computing [Elsevier BV]
Volume: 144, article 104966; cited by: 2
Identifier
DOI:10.1016/j.imavis.2024.104966
Abstract

Thanks to their wide field of view, fisheye cameras capture much more visual information and are therefore widely used in computer vision. However, fisheye images often have to be projected before they can be used for object detection. The projection distorts the image, and the discontinuous image edges leave objects incomplete. Fisheye images are also characterized by objects that appear large when near and small when far. These problems remain challenging for the existing advanced object detector YOLOv7. In this paper, we therefore propose an improved YOLOv7 model. First, Modulated Deformable Convolution is introduced into the YOLOv7 model to adapt automatically to the varying distortion of objects in fisheye images. It not only adjusts the sampling positions of the convolutional kernel but also further extends the deformation range, so the improved model can efficiently extract features of distorted and edge-discontinuous objects. Second, because objects farther from the fisheye lens appear smaller, Swin Transformer is also introduced into the YOLOv7 model to further improve the detection of small objects; the Swin Transformer Block with Window Multi-head Self-Attention (W-MSA) effectively enhances the network's local perception. On the ERP-360 dataset, the proposed model improves mAP by up to 2.4% over the original YOLOv7 model and achieves the best results among state-of-the-art object detection methods for equirectangular projection images. On the VOC-360 dataset, the proposed model improves mAP by up to 5.9% over the original YOLOv7 model. The experimental results show that the proposed models achieve good results for object detection in both fisheye images and equirectangular projection images.
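To make the phrase "adjusts the sampling position of the convolutional kernel" concrete, the following is a minimal pure-Python sketch of a 3x3 modulated deformable convolution evaluated at a single output location. It is not the authors' implementation: the function names `bilinear_sample` and `modulated_deform_conv_at` are hypothetical, and the offsets and modulation masks, which a real network would predict from the input, are passed in by hand here. Each kernel tap reads the feature map at its regular grid position plus an offset, using bilinear interpolation, and its contribution is scaled by a modulation scalar.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly sample a 2D list at fractional (y, x); out-of-bounds cells read as 0."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Interpolation weight falls off linearly with distance to the cell.
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def modulated_deform_conv_at(feature, weights, offsets, masks, cy, cx):
    """Output of a 3x3 modulated deformable convolution at location (cy, cx).

    weights: 9 kernel weights, row-major over the 3x3 grid
    offsets: 9 (dy, dx) offsets, one per kernel tap (hand-set in this sketch)
    masks:   9 modulation scalars in [0, 1], one per kernel tap
    """
    grid = [(gy, gx) for gy in (-1, 0, 1) for gx in (-1, 0, 1)]
    out = 0.0
    for k, (gy, gx) in enumerate(grid):
        dy, dx = offsets[k]
        # Regular grid position shifted by the per-tap offset.
        sample = bilinear_sample(feature, cy + gy + dy, cx + gx + dx)
        out += weights[k] * masks[k] * sample
    return out
```

With all offsets set to zero and all masks set to one, the function reduces to an ordinary 3x3 convolution at that location; setting a mask to zero removes the corresponding tap entirely.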
The ERP-360 dataset, source code and pre-trained models for related tasks can be found at https://github.com/xiaoxiaomichong/ERP-360dataset.
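The Window Multi-head Self-Attention (W-MSA) named in the abstract restricts attention to non-overlapping local windows of the feature map. The sketch below shows only the window partitioning step and a simple count of query-key pairs, in pure Python. It is an illustration under stated assumptions, not the code from the repository above: `window_partition` and `attention_pair_counts` are hypothetical names, and the feature map size is assumed to be divisible by the window size (no padding is handled).

```python
def window_partition(feature, window_size):
    """Split a 2D grid (list of rows) into non-overlapping square windows.

    Returns the windows in row-major order; each window is a flat list of
    window_size * window_size tokens.
    """
    h, w = len(feature), len(feature[0])
    if h % window_size or w % window_size:
        raise ValueError("height and width must be divisible by window_size")
    windows = []
    for top in range(0, h, window_size):
        for left in range(0, w, window_size):
            windows.append([
                feature[y][x]
                for y in range(top, top + window_size)
                for x in range(left, left + window_size)
            ])
    return windows


def attention_pair_counts(h, w, window_size):
    """Query-key pair counts for global attention versus windowed attention."""
    tokens = h * w
    global_pairs = tokens * tokens                      # every token attends to every token
    window_pairs = tokens * window_size * window_size   # every token attends within its window
    return global_pairs, window_pairs
```

Because each token attends only to the tokens in its own window, the number of query-key pairs grows linearly with the number of tokens rather than quadratically, which is what keeps the attention local.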