Data Augmentation for Offline Arabic Handwritten Text Recognition Using Moving Least Squares

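Computer science; Handwriting; Artificial intelligence; Convolutional neural network; Task (project management); Deep learning; Handwriting recognition; Arabic; Generative grammar; Natural language processing; Artificial neural network; Speech recognition; Pattern recognition (psychology); Feature extraction; Linguistics; Philosophy; Management; Economics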

Authors

Mohamed Amine Chadli, Rochdi Bachir Bouiadjra, Abdelkader Fekir, Jesús Martínez-Gómez, José A. Gámez

Source

Journal: Revue d'intelligence artificielle [International Information and Engineering Technology Association]
Date: 2024-02-29; Volume (issue): 38 (1): 1-9; Cited by: 2

Links

iieta.org, doi.org

Identifier

DOI: 10.18280/ria.380101

Abstract

This paper addresses the research problem of offline Arabic Handwritten Text Recognition (HTR). Deep learning is one of the most important approaches to building HTR systems, but training such systems requires a large amount of annotated data. Arabic is spoken by hundreds of millions of people across North Africa and the Middle East, and writing styles and common words differ significantly between those regions. Given this diversity, building a statistically representative and balanced database of Arabic handwritten texts by gathering and labeling samples is an arduous task. One way to enrich training databases is to augment the existing data. We have developed a new data augmentation technique for Arabic handwritten texts that uses Moving Least Squares (MLS) to deform the images. The technique produces realistic images that look as if real-world images had been manipulated, and the deformations are computed with linear functions, so they can be produced in real time. We aim to deform the training images randomly in such a way that the text in them remains recognizable by a human. Unlike techniques such as Generative Adversarial Networks (GANs), which must be trained beforehand, this technique can be applied directly to images. At the same time, it produces more complex augmented images than simple traditional augmentation techniques such as rotations and translations. To test the new technique, we used a deep learning system called a Convolutional Recurrent Neural Network (CRNN), and we experimented with a CRNN model that accepts small input-size images to reduce the time needed for both training and image augmentation. All experiments are carried out on the Arabic IFN/ENIT database. The results show that the small-input-size CRNN model outperforms the large-input-size CRNN model by a large margin. The results also show that integrating images augmented by the MLS technique can help the recognition system generalize better on the test data and can therefore slightly improve its performance.
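
To make the idea of MLS-based deformation concrete, the following is a minimal sketch in pure Python, not the authors' implementation. It assumes the affine variant of MLS (the abstract only says the deformations use linear functions), and the function names `mls_affine_point` and `mls_augment`, the choice of the four image corners as control handles, the `max_shift` jitter range, and the nearest-neighbour sampling are all hypothetical choices made for illustration.

```python
import random


def mls_affine_point(v, p, q, alpha=1.0, eps=1e-9):
    """Map point v under an affine MLS deformation that moves handles p to q.

    p and q are equal-length lists of (x, y) tuples; at least three
    non-collinear handles are needed for the affine fit to be well defined.
    """
    vx, vy = v
    weights = []
    for (px, py), target in zip(p, q):
        d2 = (px - vx) ** 2 + (py - vy) ** 2
        if d2 < eps:
            return target  # v sits on a handle, so it moves with the handle
        weights.append(1.0 / d2 ** alpha)  # closer handles weigh more
    wsum = sum(weights)
    # Weighted centroids of the source and target handles.
    psx = sum(w * pt[0] for w, pt in zip(weights, p)) / wsum
    psy = sum(w * pt[1] for w, pt in zip(weights, p)) / wsum
    qsx = sum(w * pt[0] for w, pt in zip(weights, q)) / wsum
    qsy = sum(w * pt[1] for w, pt in zip(weights, q)) / wsum
    # A = sum w * ph^T ph (symmetric), B = sum w * ph^T qh, with centred points.
    a11 = a12 = a22 = 0.0
    b11 = b12 = b21 = b22 = 0.0
    for w, (px, py), (qx, qy) in zip(weights, p, q):
        phx, phy = px - psx, py - psy
        qhx, qhy = qx - qsx, qy - qsy
        a11 += w * phx * phx
        a12 += w * phx * phy
        a22 += w * phy * phy
        b11 += w * phx * qhx
        b12 += w * phx * qhy
        b21 += w * phy * qhx
        b22 += w * phy * qhy
    det = a11 * a22 - a12 * a12
    dx, dy = vx - psx, vy - psy
    # Row vector (v - p*) times the inverse of A, then times B, plus q*.
    rx = (dx * a22 - dy * a12) / det
    ry = (-dx * a12 + dy * a11) / det
    return (rx * b11 + ry * b21 + qsx, rx * b12 + ry * b22 + qsy)


def mls_augment(image, max_shift=2.0, fill=0, seed=None):
    """Randomly deform a grayscale image given as a list of rows.

    Assumption for illustration: the four corners act as handles and are
    jittered by up to max_shift pixels. Each output pixel samples the input
    at its deformed location using nearest-neighbour lookup.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    p = [(0.0, 0.0), (w - 1.0, 0.0), (0.0, h - 1.0), (w - 1.0, h - 1.0)]
    q = [(x + rng.uniform(-max_shift, max_shift),
          y + rng.uniform(-max_shift, max_shift)) for x, y in p]
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = mls_affine_point((float(x), float(y)), p, q)
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out
```

A call such as `mls_augment(img, max_shift=1.5, seed=0)` returns a deformed copy with the same dimensions as `img`. Because nothing has to be trained first, the function can be applied on the fly while loading training batches, which is the property the abstract contrasts with GAN-based augmentation. Keeping `max_shift` small relative to the image size is one simple way to keep the deformed text readable.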
