Data Augmentation for Offline Arabic Handwritten Text Recognition Using Moving Least Squares

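Keywords: Computer science; Handwriting; Artificial intelligence; Convolutional neural network; Task (project management); Deep learning; Handwriting recognition; Arabic; Generative grammar; Natural language processing; Artificial neural network; Speech recognition; Pattern recognition (psychology); Feature extraction; Linguistics; Philosophy; Management; Economics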
Authors
Mohamed Amine Chadli, Rochdi Bachir Bouiadjra, Abdelkader Fekir, Jesús Martínez-Gómez, José A. Gámez
Source
Journal: Revue d'intelligence artificielle [International Information and Engineering Technology Association]
Volume (issue): 38 (1): 1-9 Cited by: 2
Identifier
DOI: 10.18280/ria.380101
Abstract

This paper addresses offline Arabic Handwritten Text Recognition (HTR). Deep learning is one of the most important approaches to HTR, but training deep learning-based HTR systems requires a large amount of annotated data. Arabic is spoken by hundreds of millions of people in North Africa and the Middle East, and writing styles and common words differ significantly between those regions. Given this diversity, building a statistically representative and balanced database of Arabic handwritten texts by gathering and labeling samples is an arduous task. One way to enrich training databases is to augment the existing data. We have developed a new data augmentation technique for Arabic handwritten texts that uses Moving Least Squares (MLS) to deform the images. The technique produces realistic images that resemble manipulated real-world images, and the deformations are computed with linear functions, so they can be produced in real time. We aim to deform the training images randomly in such a way that the text in them remains recognizable by a human. Unlike techniques such as Generative Adversarial Networks (GANs), which must be trained beforehand, this technique can be applied directly to images. At the same time, it produces more complex augmented images than simple traditional augmentation techniques such as rotations and translations. To test the new technique, we used a deep learning system called a Convolutional Recurrent Neural Network (CRNN), and we experimented with a CRNN model that accepts small input-size images to reduce the time needed for both training and image augmentation. All experiments are carried out on the Arabic IFN/ENIT database. The results show that the small-input-size CRNN model outperforms the large-input-size CRNN model by a large margin. The results also show that integrating images augmented by the MLS technique can help the recognition system generalize better on the test data and can therefore slightly improve its performance.
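
To make the MLS idea in the abstract concrete, here is a minimal Python sketch of a random image deformation. It assumes the affine variant of Moving Least Squares; the abstract does not say which variant the authors use. The function names (`mls_affine`, `warp_image`, `random_mls_augment`), the number of control points, the shift range, and the nearest-neighbour sampling are all hypothetical choices made for illustration, not the paper's implementation.

```python
import numpy as np

def mls_affine(v, p, q, alpha=1.0, eps=1e-8):
    """Map points v (N,2) with the affine MLS deformation that moves
    control points p (K,2) towards q (K,2). Illustrative sketch only."""
    v, p, q = (np.asarray(a, dtype=float) for a in (v, p, q))
    # Inverse-distance weights w_i = 1 / |p_i - v|^(2*alpha), one row per point.
    d2 = ((v[:, None, :] - p[None, :, :]) ** 2).sum(-1)          # (N, K)
    w = 1.0 / (d2 ** alpha + eps)
    wsum = w.sum(axis=1, keepdims=True)
    p_star = (w @ p) / wsum                                      # weighted centroids
    q_star = (w @ q) / wsum
    p_hat = p[None, :, :] - p_star[:, None, :]                   # (N, K, 2)
    q_hat = q[None, :, :] - q_star[:, None, :]
    # Per-point 2x2 weighted least-squares systems A M = B.
    A = np.einsum('nk,nki,nkj->nij', w, p_hat, p_hat)
    B = np.einsum('nk,nki,nkj->nij', w, p_hat, q_hat)
    M = np.linalg.solve(A, B)
    return np.einsum('ni,nij->nj', v - p_star, M) + q_star

def warp_image(img, p, q, alpha=1.0):
    """Warp img so that content near p moves towards q (backward mapping,
    nearest-neighbour sampling; control points are (x, y) pairs)."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    src = mls_affine(grid, q, p, alpha)     # where each output pixel comes from
    sx = np.clip(np.rint(src[:, 0]), 0, w - 1).astype(int)
    sy = np.clip(np.rint(src[:, 1]), 0, h - 1).astype(int)
    return img[sy, sx].reshape(img.shape)

def random_mls_augment(img, n_ctrl=6, max_shift=3.0, rng=None):
    """Pick random control points and shift each by a small random offset.
    n_ctrl and max_shift are assumed values, not taken from the paper."""
    rng = np.random.default_rng(rng)
    h, w = img.shape[:2]
    p = rng.uniform([0, 0], [w - 1, h - 1], size=(n_ctrl, 2))
    q = p + rng.uniform(-max_shift, max_shift, size=(n_ctrl, 2))
    return warp_image(img, p, q)
```

Keeping `max_shift` small relative to the stroke width is one way to keep the deformed text legible to a human, which is the constraint the abstract states; a suitable value would need to be tuned on real data. No training step is involved, which matches the abstract's contrast with GAN-based augmentation.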
