Chinese named entity recognition in the furniture domain based on ERNIE and adversarial learning

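Computer science, Adversarial system, Domain (mathematical analysis), Natural language processing, Artificial intelligence, Computer security, World Wide Web, Mathematics, Mathematical analysis
Authors
Yang Song, Yanhe Jia, Jiliang Zhang
Source
Journal: International Journal of Web Information Systems [Emerald Publishing Limited]
Identifier
DOI:10.1108/ijwis-08-2024-0239
Abstract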

Purpose To address annotation noise, ambiguity recognition and nested entity recognition in Chinese furniture-domain text, this paper designs a new recognition model, ALE-BiLSTM-CRF.
Design/methodology/approach In the Chinese furniture-domain named entity recognition (NER) task, text characters are relatively independent of one another and each provides limited information. To address this, a model named ALE-BiLSTM-CRF is proposed. First, the ERNIE pre-trained model transforms text into dynamic vectors that integrate contextual information, and adversarial learning generates adversarial samples to enhance the robustness of the model. Next, the BiLSTM module captures the temporal information of the context, and the multi-head attention mechanism integrates long-distance semantic features into the character vectors. Finally, a CRF layer learns the constraints between labels, enabling the model to generate more reasonable and semantically consistent label sequences.
Findings Comparative experiments with mainstream models on the Weibo data set yield an F1 score of 75.52%, demonstrating the model’s generality and robustness. Comparative and ablation experiments on a self-constructed data set for the Chinese furniture field yield an F1 score of 89.62%, verifying the model’s superiority and effectiveness.
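The role of the CRF layer, keeping label sequences consistent, can be illustrated with a small decoding sketch. This is not the paper's implementation: a trained CRF learns transition scores from data, whereas the sketch below hard-codes a single BIO rule (an I- tag may only follow a B- or I- tag of the same type). The tag set `LABELS`, the entity type `FUR` and the emission scores are hypothetical.

```python
import math

# Hypothetical tag set; "FUR" stands in for a furniture entity type.
LABELS = ["O", "B-FUR", "I-FUR"]


def allowed(prev, curr):
    """BIO rule: I-X may only follow B-X or I-X; anything else is free."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def viterbi(emissions, labels=LABELS):
    """Return the highest-scoring label path that respects the BIO rule.

    emissions: one dict per character, mapping label -> score
    (in a full model these would come from the encoder layers).
    """
    # A sequence cannot start with an I- tag.
    score = {
        lab: (emissions[0][lab] if not lab.startswith("I-") else -math.inf)
        for lab in labels
    }
    backpointers = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for curr in labels:
            best_prev, best = None, -math.inf
            for prev in labels:
                if allowed(prev, curr) and score[prev] > best:
                    best_prev, best = prev, score[prev]
            new_score[curr] = best + em[curr] if best_prev is not None else -math.inf
            ptr[curr] = best_prev
        score = new_score
        backpointers.append(ptr)
    # Follow back-pointers from the best final label.
    path = [max(score, key=score.get)]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]


# Per-character argmax would give the invalid sequence O, I-FUR.
emissions = [
    {"O": 1.0, "B-FUR": 0.8, "I-FUR": 0.0},
    {"O": 0.0, "B-FUR": 0.1, "I-FUR": 2.0},
]
decoded = viterbi(emissions)
```

Here `decoded` starts the entity with `B-FUR` rather than `O`, because the constrained search rejects the transition from `O` to `I-FUR` even though `O` has the higher score on the first character.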
Research limitations/implications The model’s universality and generalization are demonstrated through comparative experiments with mainstream models on the Weibo data set. Comparative experiments with representative pre-trained models on the furniture data set, together with ablation experiments on the model itself, further demonstrate its superiority and effectiveness.
Practical implications In the furniture domain, NER uses various methods, including rule templates, machine learning and deep learning techniques, to extract structured furniture-related information from unstructured text. This information may include the name, material, brand, style and function of the furniture. Extracting and identifying these named entities can provide more accurate data support for furniture design, manufacturing and marketing, thereby promoting further development and innovation in the furniture industry.
Social implications NER in the furniture field faces challenges that differ from those of entity recognition in general fields. Furniture terminology is often highly specialized and structurally complex, and furniture-domain text may contain a large number of nested entities. For example, the furniture name “sofa bed” contains the two entities “sofa” and “bed.” Current sequence labeling methods often struggle to recognize such nested entity structures simultaneously. In addition, because furniture terminology and descriptions may change with trends and design styles, the model needs a degree of adaptability and the capacity to be updated. Together, these factors make information extraction, and NER in particular, especially difficult in the furniture field.
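The difficulty with nested entities such as “sofa bed” can be made concrete with a short sketch. Flat sequence labeling assigns exactly one tag per token, so overlapping spans cannot all be encoded. The code below is only an illustration of that limitation, not the paper's method; the span format `(start, end, label)`, the label `FUR` and the longest-span-first policy are assumptions made for the example.

```python
def spans_overlap(a, b):
    """True if half-open spans a and b share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def flat_bio(n_tokens, spans):
    """Encode spans as one BIO tag per token.

    Longer spans are encoded first (an assumed policy); any span that
    overlaps an already-encoded span cannot be represented and is dropped.
    Returns (tags, dropped_spans).
    """
    tags = ["O"] * n_tokens
    kept, dropped = [], []
    for span in sorted(spans, key=lambda s: s[0] - s[1]):  # longest first
        if any(spans_overlap(span, k) for k in kept):
            dropped.append(span)
            continue
        start, end, label = span
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
        kept.append(span)
    return tags, dropped


tokens = ["sofa", "bed"]
# Hypothetical annotation: the full name plus the two nested entities.
spans = [(0, 1, "FUR"), (1, 2, "FUR"), (0, 2, "FUR")]
tags, dropped = flat_bio(len(tokens), spans)
```

With this input, `tags` encodes only the outer entity “sofa bed”, and both inner entities end up in `dropped`. Choosing the inner spans instead would lose the outer one, which is why a single flat tag sequence cannot cover all three.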
Originality/value Comparative experiments with mainstream models on the Weibo data set yield an F1 score of 75.52%, demonstrating the model’s generality and robustness. Comparative and ablation experiments on a self-constructed data set for the Chinese furniture field yield an F1 score of 89.62%, verifying the model’s superiority and effectiveness.
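The abstract reports F1 scores without stating how they are computed. A common convention in NER, assumed here, is exact-match entity-level F1: a predicted entity counts as correct only if its boundaries and type both match the gold annotation. The sketch below follows that convention; the tag names `FUR` and `MAT` and the toy sequences are hypothetical and are not taken from the paper's data.

```python
def extract_spans(tags):
    """Collect (start, end, label) spans from a BIO tag sequence."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold_seqs, pred_seqs):
    """Micro-averaged, exact-match entity-level F1 over a list of sentences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_spans(gold), extract_spans(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: two gold entities, one of which is predicted correctly.
gold = [["B-FUR", "I-FUR", "O", "B-MAT"]]
pred = [["B-FUR", "I-FUR", "O", "O"]]
score = entity_f1(gold, pred)
```

In this toy case precision is 1.0 and recall is 0.5, so `score` is 2/3.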