Computer science
Information retrieval
Natural language processing
Matching (statistics)
Artificial intelligence
Medicine
Pathology
Authors
Jiajun Zhang, Jun‐Jie Fang, Chengkun Zhang, Wei Zhang, Hong Ren, Liuchang Xu
Abstract
Geographical named entity matching, a crucial step in address encoding, aims to improve address resolution accuracy by precisely identifying and linking geographical named entity data. However, existing approaches tend to ignore the spatial information of entities, which leads to misclassification. Drawing on the way humans search for addresses, this study proposes GNEMM, a multi-objective learning model that integrates the semantic and spatial information of geographical named entities. To further mimic the human cognitive process during address search, the model incorporates the Retrieval-Augmented Generation (RAG) technique: by combining newly added external address data with an advanced large language model (LLM) such as GPT-4, it achieves precise address evaluation and recommendation. The model was tested on a standard geographical named entity dataset from Shandong Province, focusing on three sub-tasks: element segmentation, matching, and spatial similarity score prediction. The experimental results indicate that the method achieves a geographical named entity matching accuracy of up to 99%, with improvements of 10% and 5% in the segmentation and prediction sub-tasks. GNEMM performs best on address-matching tasks of various scales, and the vectors it extracts perform best in downstream retrieval and matching across various address types, which verifies its applicability to geographical named entity recommendation.