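Syntax
Computer science
Natural language processing
Encoding (memory)
Abstract syntax tree
Artificial intelligence
Component (thermodynamics)
Parsing
Abstract syntax
Linguistics
Philosophy
Physics
Thermodynamics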
Authors
Shitou Zhang, Ping Wang, Zuchao Li, Jingrui Hou, Qibiao Hu
Identifier
DOI:10.1016/j.ipm.2023.103616
Abstract
While neural-based models continue to make rapid strides, syntax remains a foundational element in the domain of Natural Language Processing (NLP), particularly in the context of Chinese language understanding. However, there exists a significant gap in research that integrates syntactic information for the understanding of ancient Chinese, primarily due to the lack of high-quality syntactic annotations. This paper explores the untapped potential of syntax to enhance ancient Chinese understanding, leveraging the “not-so-perfect” noisy syntax trees generated by unsupervised derivations and modern Chinese syntax parsers. To achieve this, we introduce a novel syntax encoding component: the confidence-based syntax encoding network (cSEN). This component is tailored to mitigate the side-effects arising from the noise associated with unsupervised syntax derivations and the incompatibility between ancient and modern Chinese. We validate the importance of syntax information and the efficacy of our cSEN through experimental tasks, specifically ancient poetry theme classification and ancient–modern Chinese translation. Our findings suggest that proper implementation of syntactic information can effectively enhance model understanding of ancient Chinese. The introduced cSEN proves vital in noise-rich environments, potentially revolutionizing the way information professionals approach and utilize ancient Chinese texts.