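Complement (music)
Collocation (remote sensing)
Computer science
Natural language processing
Verb
Artificial intelligence
Semantics (computer science)
Linguistics
Programming language
Biochemistry
Gene
Machine learning
Phenotype
Philosophy
Chemistry
Complementarity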
Authors
Tian Shao, Gaoqi Rao, Shiquan Zhai, Endong Xun
Identifier
DOI:10.1109/ialp57159.2022.9961303
Abstract
In natural language understanding, a marked verb-complement structure can be identified by complement markers such as 得 (an auxiliary word). However, it is difficult to correctly identify an unmarked verb-complement structure and to label the semantic category it expresses. Current research lacks a semantic collocation database that covers the full picture of the verb-complement structure. This paper builds such a database for the verb-complement structure from a large-scale Chinese chunkbank. First, we summarize a comprehensive semantic classification system for the verb-complement structure, subdividing the major categories and adding the extended meanings of complements, for a total of seven major categories and twenty-five subcategories. Second, we formalize the verb-complement structure and write twenty-seven types of formal expressions. Third, we extract verb-complement collocations from the chunkbank according to these formal expressions, then disambiguate and semantically subclassify the extraction results. Finally, we construct and statistically analyze a semantic collocation database containing about 180,000 verb-complement collocations. This paper provides a research paradigm for constructing semantic collocation databases, and supplies proper structural knowledge and rich semantic knowledge for natural language understanding.