Blueprint separable convolution Transformer network for lightweight image super-resolution

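Blueprint; Separable space; Transformer; Computer science; Image (mathematics); Convolution (computer science); Artificial intelligence; Mathematics; Electrical engineering; Engineering; Mechanical engineering; Mathematical analysis; Voltage; Artificial neural network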
Authors
Xiuping Bi, Shi Chen, Lefei Zhang
Source
Journal: Journal of Image and Graphics [University of Portsmouth]
Volume (issue): 29 (4): 875-889  Cited by: 2
Identifier
DOI:10.11834/jig.230225
Abstract

Objective Image super-resolution reconstruction aims to recover, from a low-resolution image, a high-resolution image with richer detail. In recent years, Transformer-based deep neural networks have achieved remarkable performance in image super-resolution, but these networks tend to have very large parameter counts and high computational cost. To address this problem, a lightweight image super-resolution network is designed.

Method A blueprint separable convolution Transformer network (BSTN) is proposed for lightweight image super-resolution. A blueprint feed-forward neural network and a blueprint multi-head self-attention module are designed on the basis of blueprint separable convolution (BSConv). A shift channel attention block (SCAB) is then designed to strengthen key channel information; it comprises shift convolution, contrast-aware channel attention, and the blueprint feed-forward neural network. Finally, a blueprint multi-head self-attention block (BMSAB) is designed, which performs self-attention at low computational cost by combining blueprint multi-head self-attention with the blueprint feed-forward neural network.

Result The proposed method is compared with 10 advanced lightweight super-resolution methods on 4 datasets. Objectively, it leads by varying margins on the different datasets, while its parameter count and floating-point operations remain low. At magnification factors of 2, 3, and 4, it improves the peak signal-to-noise ratio (PSNR) on Set5 by 0.11 dB, 0.16 dB, and 0.17 dB, respectively, over state-of-the-art (SOTA) methods. Subjectively, the reconstructed images are sharp, with small blurred regions and rich detail.

Conclusion The proposed BSTN reaches an advanced level with few parameters and floating-point operations and produces high-quality super-resolution results.

Objective Image super-resolution aims to enhance the resolution and quality of low-resolution images, making them more visually appealing and better suited to human or machine recognition. Given degraded low-resolution images with coarse details, the goal is to reconstruct high-resolution images with finer details. Super-resolution algorithms have wide applications, including object detection, medical pathological analysis, remote sensing satellite imagery, and security monitoring, and these prospects have drawn growing attention from researchers. With the advancement of deep learning in computer vision, deep networks have been successfully applied to image super-resolution with significant results. However, the large parameter counts and computational requirements of super-resolution models make them slow to run, which limits their practicality in real-world use, particularly on mobile and edge devices. Several lightweight super-resolution models have been proposed to address this issue. Among these models, the Transformer-based approach stands out because it provides rich detail in reconstructed images. However, this type of model still suffers from computational redundancy and large model size. To overcome these challenges, this study presents a novel lightweight super-resolution network based on the Transformer architecture.

Method A blueprint separable convolution Transformer network (BSTN) is proposed for lightweight image super-resolution. BSTN is divided into three parts: shallow feature extraction, deep feature extraction, and image reconstruction. In the shallow feature extraction stage, a 3 × 3 standard convolution extracts low-level features from the input image. These features capture basic image information and are passed directly to the tail of the network through a long skip connection to provide residual information. The deep feature extraction stage is composed of four successive residual attention Transformer groups (RATGs). The key elements within this stage are the shift channel attention block (SCAB) and the blueprint multi-head self-attention block (BMSAB). SCAB and BMSAB are combined to form the hybrid attention Transformer module (HATB). Each RATG consists of two HATBs joined by a residual connection and followed by a standard convolution. The blueprint feed-forward neural network is designed first, to suppress low-information features and retain only relevant and useful information. It is then introduced into the two attention modules to efficiently extract the deep features that matter for super-resolution. SCAB consists of three major components: shift convolution, contrast-aware channel attention, and the blueprint feed-forward neural network. Shift convolution reduces the number of network parameters and aggregates spatial information, enabling effective information fusion across different regions of the image. The contrast-aware channel attention mechanism focuses on important channel information, enhancing the representation of crucial features. BMSAB consists of a blueprint multi-head self-attention and a blueprint feed-forward neural network. This module computes self-attention with reduced computational complexity while suppressing low-information features through the blueprint feed-forward neural network. Finally, the shallow features extracted in the earlier stage and the deep features obtained from the RATGs are added together. The combined features are then processed using pixel shuffle, a technique that rearranges features to increase their spatial resolution. This final step generates the reconstructed high-resolution image. With this architecture and these components, the proposed lightweight network achieves effective feature extraction, self-attention calculation, and image reconstruction, addressing the parameter redundancy and large model size commonly encountered in Transformer-based super-resolution models. The method is implemented in PyTorch on an NVIDIA RTX 3090 GPU. The training datasets are DIV2K and Flickr2K, which consist of 800 and 1,000 images, respectively. The batch size is 32, and the training patch size is 48 × 48 pixels. The network is optimized with Adam; the learning rate starts at 5 × 10⁻⁴ and follows a cosine decay schedule over 10⁶ total iterations.

Result The proposed method is compared with 11 state-of-the-art approaches on 4 datasets. Quantitatively, the proposed method achieves varying degrees of improvement across magnification factors and datasets, while its parameter size and floating-point operations remain low. At a magnification factor of 2, the peak signal-to-noise ratio (PSNR) of the model ranks first on Set5, Set14, BSD100, and Urban100. It performs particularly well on Set5 and Set14, surpassing the second-best model by 0.11 dB and 0.08 dB, respectively. At a magnification factor of 3, its PSNR also ranks first, surpassing the second-best model on Set5 and Urban100 by 0.16 dB and 0.06 dB, respectively. At a magnification factor of 4, it still ranks first and outperforms the second-place models by 0.17 dB, 0.05 dB, and 0.04 dB on Set5, BSD100, and Urban100, respectively. Qualitatively, the images reconstructed by the proposed method are clear, with small blurred areas and rich details.

Conclusion Extensive comparative experiments and ablation studies demonstrate that the proposed BSTN not only achieves state-of-the-art super-resolution results with excellent quantitative and visual performance but also has fewer parameters and floating-point operations. In particular, the proposed blueprint separable multi-head self-attention can effectively perform self-attention in Transformer blocks through a concise structure. The proposed blueprint feed-forward neural network can focus on helpful information and filter out information that is useless for super-resolution, resulting in high efficiency and low cost, and it can be seamlessly integrated into other modules. Although the method performs well, its advantage in model lightness is not yet evident and should be further enhanced.
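The abstract attributes the network's low parameter count to blueprint separable convolution but gives no figures for individual layers. The sketch below illustrates the saving under stated assumptions: it assumes the unconstrained BSConv form (a 1 × 1 pointwise convolution followed by a k × k depthwise convolution), ignores bias terms, and uses a hypothetical channel width of 64 that does not come from the paper.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def bsconv_u_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in an assumed BSConv-U layer (bias ignored).

    Assumption: a 1 x 1 pointwise convolution (c_in -> c_out) followed by
    a k x k depthwise convolution with one filter per output channel.
    """
    pointwise = c_in * c_out   # 1 x 1 convolution mixing channels
    depthwise = c_out * k * k  # one k x k filter per output channel
    return pointwise + depthwise


# Hypothetical layer size, chosen only for illustration.
std = standard_conv_params(64, 64, 3)
bsc = bsconv_u_params(64, 64, 3)
print(std, bsc, round(std / bsc, 2))
```

For this hypothetical 64-channel 3 × 3 layer, the standard convolution has 36,864 weights and the assumed blueprint separable form has 4,672, roughly an eightfold reduction.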
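The reconstruction stage relies on pixel shuffle, which the abstract describes only as rearranging features to increase spatial resolution. A minimal pure-Python sketch of that rearrangement follows. It assumes the channel ordering used by PyTorch's `PixelShuffle` (an input of C·r² channels of size H × W becomes C channels of size rH × rW); the function name and the toy input are illustrative and are not taken from the paper.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed convention: out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w].
    """
    channels_in = len(x)
    height, width = len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    channels_out = channels_in // (r * r)

    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Toy input: 4 channels of size 1 x 2, upscaled by a factor of 2.
features = [[[0, 1]], [[10, 11]], [[20, 21]], [[30, 31]]]
upscaled = pixel_shuffle(features, 2)
print(upscaled)  # [[[0, 10, 1, 11], [20, 30, 21, 31]]]
```

Each group of r² input channels supplies the r × r block of output pixels around one input location, so resolution increases without any interpolation.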

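The reported gains are expressed in PSNR, which the abstract names but does not define. The sketch below uses the standard definition, PSNR = 10·log10(MAX² / MSE). The abstract does not state the evaluation protocol (colour channel, border cropping), so this function works on plain 2-D lists of pixel values and should be read as an illustration of the metric, not of the paper's exact evaluation code.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Images in [0, 1] with a uniform error of 0.1 give MSE = 0.01, i.e. about 20 dB.
clean = [[0.5, 0.5], [0.5, 0.5]]
noisy = [[0.6, 0.6], [0.6, 0.6]]
print(round(psnr(clean, noisy, max_value=1.0), 2))
```

Because the scale is logarithmic, a gain of 0.11 dB corresponds to a mean squared error about 2.5% lower than that of the compared method.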