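Computer science
Artificial intelligence
Security token
Transformer
Convolutional neural network
Feature extraction
Pixel
Pattern recognition (psychology)
Feature (linguistics)
Computer vision
Feature learning
Engineering
Electrical engineering
Philosophy
Linguistics
Computer security
Voltage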
Authors
Lili Shen, XU Shao-hu, Jing Zhang, Bo Peng
Identifier
DOI:10.1117/1.jei.32.2.023035
Abstract
Image aesthetic assessment (IAA) is a challenging computer vision task that aims to evaluate image beauty automatically by simulating human perception of image aesthetics. Although convolutional neural network (CNN)-based IAA approaches have made remarkable progress with the development of deep learning, CNNs have difficulty capturing long-distance relationships among visual elements, even though image layout and image semantic information are strongly correlated in image aesthetics. To address this problem, a parallel transformer guided by another scale is proposed, comprising a multiscale local feature extractor (ME), a feature projection (FP) module, and a parallel feature fusion transformer guided by another scale (AST). The ME captures primary local features at multiple scales with a classic ResNet. The FP performs a dimension transformation on the feature maps of each scale to obtain feature tokens and an aesthetic token. The AST, with two parallel transformer encoders, highlights the significant regions of the whole image; within it, the feature tokens of one scale are grouped with the aesthetic token from another scale to obtain interscale guidance. The final score distribution is obtained by weighting the multiple aesthetic tokens with learnable parameters for unified aesthetics assessment. Extensive experiments on two public datasets, Aesthetic Visual Analysis (AVA) and the Aesthetics and Attributes Database (AADB), demonstrate that the proposed method outperforms state-of-the-art methods across three different tasks.
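
The token grouping and the final weighting step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the softmax normalization of the fusion weights, the linear scoring head, and the tiny token dimensions are all assumptions made for illustration, and the transformer encoders themselves are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def group_tokens(feature_tokens, other_scale_aesthetic_token):
    """Build one encoder's input sequence: the aesthetic token taken from
    the *other* scale is placed in front of this scale's feature tokens."""
    return [other_scale_aesthetic_token] + list(feature_tokens)


def fuse_aesthetic_tokens(aesthetic_tokens, raw_weights):
    """Weighted sum of the aesthetic tokens. The raw weights stand in for
    learnable parameters; normalizing them with softmax is an assumption."""
    weights = softmax(raw_weights)
    dim = len(aesthetic_tokens[0])
    return [
        sum(w * tok[d] for w, tok in zip(weights, aesthetic_tokens))
        for d in range(dim)
    ]


def score_distribution(fused_token, head_weights, head_bias):
    """Hypothetical linear head followed by softmax, mapping the fused
    token to a probability distribution over score bins."""
    logits = [
        sum(w * x for w, x in zip(row, fused_token)) + b
        for row, b in zip(head_weights, head_bias)
    ]
    return softmax(logits)


# Toy data: two scales, 2-dimensional tokens, 3 score bins.
feat_scale_a = [[0.1, 0.2], [0.3, 0.4]]
aes_scale_a = [1.0, 0.0]
aes_scale_b = [0.0, 1.0]

seq_a = group_tokens(feat_scale_a, aes_scale_b)  # scale A guided by scale B
fused = fuse_aesthetic_tokens([aes_scale_a, aes_scale_b], [0.0, 0.0])
head_w = [[0.5, -0.5], [0.0, 0.0], [-0.5, 0.5]]
head_b = [0.0, 0.0, 0.0]
dist = score_distribution(fused, head_w, head_b)
```

In a real model the grouped sequence would pass through a transformer encoder before the aesthetic tokens are fused, and the fusion weights and scoring head would be trained jointly with the rest of the network.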