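Prior probability
Scale (ratio)
Computer science
Image (mathematics)
Artificial intelligence
Resolution (logic)
Computer vision
Super-resolution
Pattern recognition (psychology)
Bayesian probability
Cartography
Geography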
Authors
Zhongjie Zhu,Zhang Lei,Yongqiang Bai,Yuer Wang,Pei Li
Source
Journal: IEEE Transactions on Artificial Intelligence
[Institute of Electrical and Electronics Engineers]
Date: 2024-03-19
Volume (issue), pages: 5 (7): 3653-3663
Citations: 6
Identifier
DOI:10.1109/tai.2024.3375836
Abstract
Scene Text Image Super-resolution (STISR) aims to enhance the resolution of images containing text within a scene, making the text more readable and easier to recognize. This technique has broad applications in fields such as autonomous driving, document scanning, and image retrieval. However, most existing STISR methods do not fully exploit the multi-scale structural and semantic information within scene text images. As a result, the quality of the restored text images is insufficient, which significantly degrades subsequent tasks such as text detection and recognition. Hence, this paper proposes a novel scheme that leverages multi-scale structural and semantic priors to efficiently guide text semantic restoration, ultimately yielding high-quality text images. First, a multi-scale interaction attention (MSIA) module is designed to capture location-specific details of structural features at various scales and to facilitate the recovery of semantic information. Second, a multi-scale prior learning module (MSPLM) is developed. Within this module, skip connections are employed among codecs to strengthen both structural and semantic prior features, thereby enhancing the up-sampling and reconstruction capabilities. Finally, building upon the MSPLM, cascaded encoders are connected through residual connections to further enrich the multi-scale features and bolster the representational capacity of the prior. Experiments conducted on the standard TextZoom dataset demonstrate that the average recognition accuracies of three evaluators (ASTER, CRNN, and MORAN) are 64.4%, 53.5%, and 60.8%, respectively, surpassing most existing methods, including the state-of-the-art ones.
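The abstract mentions skip connections that carry features between encoder and decoder stages at matching scales. The following is only a toy sketch of that general idea on a 1-D signal, not the authors' MSPLM or MSIA: the function names (`downsample`, `upsample`, `encode`, `decode_with_skips`) are hypothetical, and fixed average pooling and nearest-neighbour repetition stand in for the learned layers a real network would use.

```python
def downsample(x):
    """Halve the resolution by averaging adjacent pairs.
    Assumption: a fixed stand-in for a learned encoder stage."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the resolution by repeating each value.
    Assumption: a fixed stand-in for a learned decoder stage."""
    out = []
    for v in x:
        out.extend([v, v])
    return out


def encode(x, levels):
    """Return the features at every scale, finest scale first."""
    feats = [x]
    for _ in range(levels):
        feats.append(downsample(feats[-1]))
    return feats


def decode_with_skips(feats):
    """Start from the coarsest scale and move up; at each step add the
    encoder feature of the same scale (the skip connection)."""
    y = feats[-1]
    for skip in reversed(feats[:-1]):
        up = upsample(y)
        y = [u + s for u, s in zip(up, skip)]
    return y


signal = [1.0, 3.0, 2.0, 6.0, 4.0, 4.0, 0.0, 8.0]
feats = encode(signal, levels=2)       # scales of length 8, 4, and 2
restored = decode_with_skips(feats)    # back at length 8
print([len(f) for f in feats], restored)
```

Without the skip additions, the decoder would only see the two coarsest values and every fine detail of `signal` would be lost; adding the same-scale encoder features back in is what lets the output keep them.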