Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection

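Modal verb, Salience, Computer science, Artificial intelligence, Computer vision, Fourier transform, Object (grammar), Mathematics, Materials science, Mathematical analysis, Polymer chemistry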
Authors
Pengfei Lyu, Xiaosheng Yu, Pak-Hei Yeung, Chengdong Wu, Jagath C. Rajapakse
Source
Journal: Cornell University - arXiv
Identifier
DOI: 10.48550/arxiv.2411.18409
Abstract

The rapid development of deep learning has significantly improved salient object detection (SOD) combining both RGB and thermal (RGB-T) images. However, existing Transformer-based RGB-T SOD models with quadratic complexity are memory-intensive, limiting their application in high-resolution bimodal feature fusion. To overcome this limitation, we propose a purely Fourier Transform-based model, namely Deep Fourier-embedded Network (FreqSal), for accurate RGB-T SOD. Specifically, we leverage the efficiency of Fast Fourier Transform with linear complexity to design three key components: (1) To fuse RGB and thermal modalities, we propose Modal-coordinated Perception Attention, which aligns and enhances bimodal Fourier representation in multiple dimensions; (2) To clarify object edges and suppress noise, we design Frequency-decomposed Edge-aware Block, which deeply decomposes and filters Fourier components of low-level features; (3) To accurately decode features, we propose Fourier Residual Channel Attention Block, which prioritizes high-frequency information while aligning channel-wise global relationships. Additionally, even when converged, existing deep learning-based SOD models' predictions still exhibit frequency gaps relative to ground-truth. To address this problem, we propose Co-focus Frequency Loss, which dynamically weights hard frequencies during edge frequency reconstruction by cross-referencing bimodal edge information in the Fourier domain. Extensive experiments on ten bimodal SOD benchmark datasets demonstrate that FreqSal outperforms twenty-nine existing state-of-the-art bimodal SOD models. Comprehensive ablation studies further validate the value and effectiveness of our newly proposed components. The code is available at https://github.com/JoshuaLPF/FreqSal.
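
The general idea of weighting "hard" frequencies in a Fourier-domain loss can be illustrated with a small NumPy sketch. This is not the authors' Co-focus Frequency Loss: it omits the bimodal edge cross-referencing described above, and the function name `weighted_frequency_loss` and the parameter `alpha` are hypothetical, chosen only for illustration. It assumes single-channel 2D maps and treats the weights as constants.

```python
import numpy as np

def weighted_frequency_loss(pred, target, alpha=1.0):
    """Illustrative frequency-domain loss (hypothetical, not the paper's loss).

    Frequencies where the prediction deviates most from the ground truth
    receive larger weights, so they dominate the averaged distance.
    """
    # 2D FFT of both maps; orthonormal scaling keeps magnitudes comparable.
    pred_f = np.fft.fft2(pred, norm="ortho")
    target_f = np.fft.fft2(target, norm="ortho")

    # Squared distance between the complex spectra at each frequency.
    dist = np.abs(pred_f - target_f) ** 2

    # Dynamic weights: larger spectral error -> larger weight, scaled to [0, 1].
    weight = np.sqrt(dist) ** alpha
    max_w = weight.max()
    if max_w > 0:
        weight = weight / max_w

    return float(np.mean(weight * dist))

# Toy example: a square "salient object" mask versus an empty prediction.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0
empty_pred = np.zeros((8, 8))
loss_value = weighted_frequency_loss(empty_pred, target)
```

In a training framework with automatic differentiation, such weights would typically be excluded from the gradient so that they only rescale the per-frequency error.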