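Computer science
Transformer
Artificial intelligence
Deep learning
Computer vision
Task (project management)
Machine vision
Machine learning
Pattern recognition (psychology)
Engineering
Voltage
Electrical engineering
Systems engineering
Authors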
Ibrahim Yahaya Garta,Riqing Chen
Identifier
DOI:10.1049/icp.2023.3222
Abstract
Recognizing shoe types is a computer vision task with a wide range of practical applications. Traditional approaches to shoe recognition often involve manual cataloging and rely on descriptive image information, which can be time-consuming and imprecise. Deep learning approaches have improved on this catalog-based way of recognizing shoes. More recently, advances in computer vision have introduced transformer models, which have shown promising results in image recognition tasks. This paper proposes a shoe recognition model based on a Vision Image Transformer (ViT) to improve on the above approaches. A fully connected layer is added to the model to perform classification. The model was trained on a purpose-built shoe dataset in Keras on Google Colab (Python 3.10.12) with a GPU, using Adam as the optimizer, and achieved 99.2% validation accuracy. The proposed model performs better than other deep learning models on the same dataset.
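To make the ViT input step concrete, the following is a minimal sketch of how an image can be cut into fixed-size patches that a transformer then treats as a token sequence. It is an illustration only, not the authors' code: the abstract does not give the image size, patch size, or whether a class token is used, so the helper names `patchify` and `vit_sequence_length`, the toy 4x4 image, and the 224/16 sizes are all assumptions chosen for the example.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened,
    non-overlapping patch_size x patch_size patches, row by row."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    flat.extend(pixel)  # append every channel value of this pixel
            patches.append(flat)
    return patches


def vit_sequence_length(image_size, patch_size, class_token=True):
    """Number of tokens for a square image: one per patch, plus an
    optional class token (an assumption; the abstract does not say)."""
    per_side = image_size // patch_size
    return per_side * per_side + (1 if class_token else 0)


# Toy 4x4 image with 3 channels; each pixel stores row * 4 + col in all channels.
toy_image = [[[r * 4 + c] * 3 for c in range(4)] for r in range(4)]
toy_patches = patchify(toy_image, patch_size=2)  # 4 patches of 2*2*3 = 12 values
```

In a full model, each flattened patch would be linearly projected to an embedding, combined with position information, passed through transformer encoder blocks, and finally fed to a fully connected classification layer such as the one the abstract describes.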