Machine Translation Testing via Syntactic Tree Pruning

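Keywords: Computer science; Machine translation; Pruning; Natural language processing; Artificial intelligence; Tree (set theory); Translation (biology); Programming language; Mathematics; Mathematical analysis; Agronomy; Biology; Biochemistry; Chemistry; Messenger RNA; Gene
Authors
Quanjun Zhang, Juan Zhai, Chunrong Fang, Jiawei Liu, Weisong Sun, Haichuan Hu, Qingyu Wang
Source
Journal: ACM Transactions on Software Engineering and Methodology [Association for Computing Machinery]
Volume (issue): 33 (5): 1-39 Cited by: 1
Identifier
DOI: 10.1145/3640329
Abstract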

Machine translation systems have been widely adopted in daily life, making life easier and more convenient. Unfortunately, erroneous translations may result in severe consequences, such as financial losses. This calls for improving the accuracy and reliability of machine translation systems. However, testing machine translation systems is challenging because of the complexity and intractability of the underlying neural models. To tackle these challenges, we propose a novel metamorphic testing approach based on syntactic tree pruning (STP) to validate machine translation systems. Our key insight is that a pruned sentence should have crucial semantics similar to those of the original sentence. Specifically, STP (1) proposes a core semantics-preserving pruning strategy that uses basic sentence structures and dependency relations at the level of the syntactic tree representation, (2) generates source sentence pairs based on the metamorphic relation, and (3) uses a bag-of-words model to report suspicious issues whose translations break the consistency property. We further evaluate STP on two state-of-the-art machine translation systems (i.e., Google Translate and Bing Microsoft Translator) with 1,200 source sentences as inputs. The results show that STP accurately finds 5,073 unique erroneous translations in Google Translate and 5,100 unique erroneous translations in Bing Microsoft Translator (400% more than state-of-the-art techniques), with 64.5% and 65.4% precision, respectively. The reported erroneous translations vary in type, and more than 90% of them are not found by state-of-the-art techniques. There are 9,393 erroneous translations unique to STP, which is 711.9% more than state-of-the-art techniques. Moreover, STP is effective in detecting translation errors in the original sentences, with a recall reaching 74.0%, improving on state-of-the-art techniques by 55.1% on average.
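
The consistency check in step (3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact bag-of-words measure, so the function names (`bag_of_words`, `inconsistency`, `is_suspicious`), the whitespace tokenizer, and the threshold are all assumptions made for illustration. The idea shown is only that words in the translation of a pruned sentence should largely also appear in the translation of the original sentence.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case, whitespace-split bag of words (assumed tokenizer;
    a real target language such as Chinese would need a proper segmenter)."""
    return Counter(text.lower().split())


def inconsistency(original_translation: str, pruned_translation: str) -> int:
    """Count word occurrences in the pruned translation that are not
    covered by the original translation (hypothetical distance measure)."""
    extra = bag_of_words(pruned_translation) - bag_of_words(original_translation)
    return sum(extra.values())


def is_suspicious(original_translation: str,
                  pruned_translation: str,
                  threshold: int = 0) -> bool:
    """Flag the pair when the distance exceeds an assumed threshold."""
    return inconsistency(original_translation, pruned_translation) > threshold


# Stand-in translations for an original sentence and two pruned variants.
original = "the quick brown fox jumps over the lazy dog"
consistent_pruned = "the fox jumps over the dog"
inconsistent_pruned = "the fox sleeps under the dog"

print(is_suspicious(original, consistent_pruned))
print(is_suspicious(original, inconsistent_pruned))
```

Subtracting one `Counter` from another keeps only positive counts, so words that pruning legitimately removed from the sentence do not count against the pair; only words that appear in the pruned translation without support in the original translation do.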
