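Landmark
Artificial intelligence
Computer science
Computer vision
Transformer
Character recognition
Pattern recognition (psychology)
Character (mathematics)
Image (mathematics)
Engineering
Mathematics
Electrical engineering
Geometry
Voltage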
Authors
Sirawich Vachmanus, Noppanan Phinklao, Naruparn Phongsarnariyakul, Thanat Plongcharoen, Seiji Hotta, Suppawong Tuarob
Source
Journal: IEEE Access
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume and pages: 12: 131284-131295
Identifier
DOI:10.1109/access.2024.3459419
Abstract
Comics, particularly Japanese manga, are a powerful medium that blends images and text to convey ideas and encapsulate a unique cultural heritage. Beyond entertainment, manga merges diverse styles and content deeply rooted in Japanese culture. Acknowledging the growing significance of technology in analyzing manga images, this study applies computer vision analysis, with a specific focus on facial landmark detection. After a comprehensive exploration of various methods, the research identifies BERT Pre-Training of Image Transformers (BEiT), an extension of Bidirectional Encoder Representations from Transformers (BERT) to images, as the standout performer in both efficiency and effectiveness. The BEiT model's success lies in its ability to extract facial features, which establishes it as a go-to solution for landmark detection on manga faces. Among the landmark detection networks compared, it achieved the lowest Failure Rate, approximately 9.4%, with a Mean Average Error of about 4.6 pixels. Beyond its technical accomplishments, this study carries cultural significance, contributing to the ongoing narrative of manga in Japan.
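The two reported metrics can be illustrated with a minimal sketch. The abstract does not define them, so everything here is an assumption: the per-face error is taken as the mean Euclidean distance in pixels between predicted and ground-truth landmarks, a face counts as a failure when that error exceeds a hypothetical threshold, and the function name `landmark_metrics` and parameter `fail_threshold_px` are illustrative only.

```python
import math


def landmark_metrics(pred_faces, true_faces, fail_threshold_px):
    """Return (mean_error_px, failure_rate) over a set of faces.

    Assumed definitions (not taken from the paper):
    - per-face error: mean Euclidean distance between matching landmarks
    - failure: per-face error strictly greater than fail_threshold_px
    Each face is a list of (x, y) tuples in pixel coordinates.
    """
    if len(pred_faces) != len(true_faces) or not pred_faces:
        raise ValueError("need the same non-zero number of faces")

    per_face_errors = []
    for pred, true in zip(pred_faces, true_faces):
        if len(pred) != len(true) or not pred:
            raise ValueError("landmark counts must match and be non-zero")
        # Distance between each predicted point and its ground-truth point.
        dists = [math.hypot(px - tx, py - ty)
                 for (px, py), (tx, ty) in zip(pred, true)]
        per_face_errors.append(sum(dists) / len(dists))

    mean_error = sum(per_face_errors) / len(per_face_errors)
    failures = sum(1 for e in per_face_errors if e > fail_threshold_px)
    return mean_error, failures / len(per_face_errors)


# Toy data: the first face has one landmark off by 5 px, the second is exact.
pred = [[(0.0, 0.0), (3.0, 4.0)], [(10.0, 10.0)]]
true = [[(0.0, 0.0), (0.0, 0.0)], [(10.0, 10.0)]]
mean_err, fail_rate = landmark_metrics(pred, true, fail_threshold_px=2.0)
```

Under these assumptions a lower failure rate means fewer faces with grossly misplaced landmarks, while the mean error summarizes typical localization accuracy. Landmark benchmarks often normalize the error by face size before thresholding; the pixel-based version is used here only because the abstract reports the error in pixels.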