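Image restoration
Deblurring
Computer science
Artificial intelligence
Convolutional neural network
JPEG
Image quality
Robustness (evolution)
Computer vision
Image compression
Image processing
Pattern recognition (psychology)
Image (mathematics)
Biochemistry
Gene
Chemistry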
Authors
Anas M. Ali, Bilel Benjdira, Anis Koubâa, Walid El-Shafai, Zahid Khan, Wadii Boulila
Source
Journal: Sensors
[Multidisciplinary Digital Publishing Institute]
Date: 2023-02-21
Volume (issue): 23 (5), 2385
Citations: 66
Abstract
The Vision Transformer (ViT) architecture has been remarkably successful in image restoration. Convolutional Neural Networks (CNNs) long predominated in most computer vision tasks. Now, both CNNs and ViTs are effective approaches that can restore a higher-quality version of an image from a low-quality input. This study examines the efficiency of ViT in image restoration extensively and classifies the ViT architectures by restoration task. Seven image restoration tasks are considered: Image Super-Resolution, Image Denoising, General Image Enhancement, JPEG Compression Artifact Reduction, Image Deblurring, Removing Adverse Weather Conditions, and Image Dehazing. The outcomes, advantages, limitations, and possible areas for future research are detailed. Overall, incorporating ViT into new image restoration architectures is becoming the rule. This is due to several advantages over CNNs: better efficiency, especially when more data are fed to the network; robustness in feature extraction; and a feature learning approach that better captures the variances and characteristics of the input. Nevertheless, some drawbacks exist: the need for more data to show the benefits of ViT over CNN, the increased computational cost due to the complexity of the self-attention block, a more challenging training process, and the lack of interpretability. These drawbacks are the research directions that should be targeted to increase the efficiency of ViT in the image restoration domain.
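The computational cost that the abstract attributes to the self-attention block can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the function names `patchify` and `self_attention`, the 32x32 input, the patch size, and the random projection weights are all assumptions chosen for illustration, and the sketch uses a single attention head with no positional encoding or training. It shows only that the attention matrix has one entry per pair of patches, so its size grows quadratically with the number of patches.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    rows, cols = h // patch, w // patch
    x = image.reshape(rows, patch, cols, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (rows, cols, patch, patch, c)
    return x.reshape(rows * cols, patch * patch * c)

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over N tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (N, N): one score per token pair
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical input: a random 32x32 RGB "image" and untrained random weights.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
tokens = patchify(image, patch=8)  # shape (16, 192)
dim = 16                           # assumed projection width
wq, wk, wv = (rng.standard_normal((tokens.shape[1], dim)) * 0.02 for _ in range(3))
out, weights = self_attention(tokens, wq, wk, wv)
print(tokens.shape, weights.shape)
```

Under these assumptions, halving the patch size from 8 to 4 raises the token count from 16 to 64 and the number of attention entries from 256 to 4096, which is the quadratic growth behind the cost concern.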