Utilizing Machine Learning Techniques for Classifying Translated and Non-Translated Corporate Annual Reports

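Computer science, Artificial intelligence, Machine learning, Natural language processing, Data science, Information retrieval
Authors
Zhongliang Wang, Ming Liu, Kanglong Liu
Source
Journal: Applied Artificial Intelligence [Taylor & Francis]
Volume (issue): 38 (1)  Citations: 3
Identifier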
DOI:10.1080/08839514.2024.2340393
Abstract

Globalization has led to the widespread adoption of translated corporate annual reports in international markets. Nonetheless, it remains largely unexplored whether these translated documents fulfill the same function and communicate as effectively to international investors as their non-translated counterparts. Considering their significance to stakeholders, differentiating between these two types of reports is essential, yet research in this area is insufficient. This study seeks to bridge this gap by leveraging machine learning algorithms to classify corporate annual reports based on their translation status. By constructing corpora of comparable texts and employing thirteen syntactic complexity indices as features, we analyzed the reports using eight different algorithms: Naïve Bayes, Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Neural Network, Random Forest, Gradient Boosting and Deep Learning. Additionally, ensemble models were created by combining the three most effective algorithms. The best-performing model in our study achieved an Area Under the Curve (AUC) of 99.3%. This innovative approach demonstrates the effectiveness of syntactic complexity indices in machine learning for classifying translational language in corporate reporting, contributing valuable insights to text classification and translational language research. Our findings offer critical implications for stakeholders in multilingual contexts, highlighting the need for further research in this field.
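
The abstract mentions combining the three most effective algorithms into an ensemble and scoring models by AUC. It does not say how the ensemble was built, so the following is only a minimal sketch under stated assumptions: it assumes soft voting (averaging each model's predicted probability that a report is translated), and every probability and label below is invented toy data, not a result from the study. The function names `soft_vote` and `roc_auc` are hypothetical helpers written for this illustration.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-document probabilities of several classifiers.

    Assumption: soft voting; the abstract does not name the ensemble method.
    """
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented): 1 = translated report, 0 = non-translated report.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical probabilities of "translated" from three classifiers.
model_a = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]
model_b = [0.8, 0.7, 0.6, 0.3, 0.4, 0.2]
model_c = [0.7, 0.5, 0.3, 0.6, 0.1, 0.3]

ensemble = soft_vote([model_a, model_b, model_c])
ensemble_auc = roc_auc(labels, ensemble)
print(f"Ensemble AUC on toy data: {ensemble_auc:.3f}")
```

In practice the input scores would come from classifiers trained on the thirteen syntactic complexity indices; the pairwise definition of AUC used here is equivalent to the area under the ROC curve but is quadratic in the number of documents, so a library implementation is preferable for large corpora.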