Artificial intelligence
Computer science
Pattern recognition (psychology)
Pixel
Margin (machine learning)
Graph
Convolutional neural network
Contextual image classification
Image (mathematics)
Machine learning
Theoretical computer science
Authors
Radu Alexandru Cosma,Lukas Knobel,Putri van der Linden,David M. Knigge,Erik J. Bekkers
Identifier
DOI:10.1109/iccvw60793.2023.00018
Abstract
While Convolutional Neural Networks and Vision Transformers are the go-to solutions for image classification, their model sizes make them expensive to train and deploy. Alternatively, input complexity can be reduced following the intuition that adjacent similar pixels contain redundant information. This prior can be exploited by clustering such pixels into superpixels and connecting adjacent superpixels with edges, resulting in a sparse graph representation on which Graph Neural Networks (GNNs) can operate efficiently. Although previous work clearly highlights the computational efficiency of this approach, this prior can be overly restrictive and, as a result, performance is lacking compared to contemporary dense vision methods. In this work, we propose to extend this prior by incorporating shape information into the individual superpixel representations. This is achieved through a separate, patch-level GNN. Together with enriching the previously explored appearance and pose information of superpixels and further architectural changes, our best model, ShapeGNN, surpasses the previous state-of-the-art in superpixel-based image classification on CIFAR-10 by a significant margin. We also present an optimised pipeline for efficient image-to-graph transformation and show the viability of training end-to-end on high-resolution images on ImageNet-1k.
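The abstract describes converting an image into a sparse superpixel graph before applying a GNN. The sketch below is a minimal illustration of that general idea, not the authors' ShapeGNN pipeline: it assumes SLIC as the superpixel method and uses mean colour (appearance) and centroid (pose) as hypothetical node features, with edges between spatially adjacent superpixels.

```python
import numpy as np
from skimage.segmentation import slic

def image_to_superpixel_graph(image, n_segments=100):
    """Minimal sketch of an image-to-graph transformation (assumed setup,
    not the paper's optimised pipeline)."""
    # Cluster adjacent, similar pixels into superpixels with SLIC.
    labels = slic(image, n_segments=n_segments, compactness=10, start_label=0)
    n = labels.max() + 1

    # Node features: mean RGB (appearance) and normalised centroid (pose).
    features = np.zeros((n, 5), dtype=np.float32)
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    for k in range(n):
        mask = labels == k
        features[k, :3] = image[mask].mean(axis=0)
        features[k, 3] = ys[mask].mean() / image.shape[0]
        features[k, 4] = xs[mask].mean() / image.shape[1]

    # Edges: pairs of superpixels that touch horizontally or vertically,
    # giving a sparse adjacency structure for a GNN to operate on.
    edges = set()
    for a, b in zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()):
        if a != b:
            edges.add((min(a, b), max(a, b)))
    for a, b in zip(labels[:-1, :].ravel(), labels[1:, :].ravel()):
        if a != b:
            edges.add((min(a, b), max(a, b)))

    return features, np.array(sorted(edges), dtype=np.int64)
```

The returned node-feature matrix and edge list could then be fed to any graph learning framework; the paper's contribution additionally enriches each node with shape information computed by a separate patch-level GNN, which is not reproduced here.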