The implications of handwritten text recognition for accessing the past at scale

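Computer science; Originality; Metadata; Blueprint; Data science; Narrative; Scholarship; Transcription (linguistics); Artificial intelligence; World Wide Web; Sociology; Political science; Qualitative research; Mechanical engineering; Social science; Linguistics; Philosophy; Law; Engineering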
Authors
Joseph Nockels, Paul Gooding, Melissa Terras
Source
Journal: Journal of Documentation [Emerald Publishing Limited]
Volume (issue): 80 (7): 148-167. Citations: 1
Identifier
DOI: 10.1108/jd-09-2023-0183
Abstract

Purpose: This paper focuses on image-to-text manuscript processing through Handwritten Text Recognition (HTR), a Machine Learning (ML) approach enabled by Artificial Intelligence (AI). With HTR now achieving high levels of accuracy, we consider its potential impact on our near-future information environment and knowledge of the past.

Design/methodology/approach: In undertaking a more constructivist analysis, we identified gaps in the current literature through a Grounded Theory Method (GTM). This guided an iterative process of concept mapping through writing sprints in workshop settings. We identified, explored and confirmed themes through group discussion and a further interrogation of relevant literature, until reaching saturation.

Findings: Catalogued as part of our GTM, 120 published texts underpin this paper. We found that HTR facilitates accurate transcription and dataset cleaning, while facilitating access to a variety of historical material. HTR contributes to a virtuous cycle of dataset production and can inform the development of online cataloguing. However, current limitations include dependency on digitisation pipelines, potential archival history omission and entrenchment of bias. We also cite near-future HTR considerations. These include encouraging open access, integrating advanced AI processes and metadata extraction; legal and moral issues surrounding copyright and data ethics; crediting individuals’ transcription contributions and HTR’s environmental costs.

Originality/value: Our research produces a set of best practice recommendations for researchers, data providers and memory institutions, surrounding HTR use. This forms an initial, though not comprehensive, blueprint for directing future HTR research. In pursuing this, the narrative that HTR’s speed and efficiency will simply transform scholarship in archives is deconstructed.
