Enhancing entity resolution with multichannel BERT: a comprehensive approach
Computer science
Resolution (logic)
Artificial intelligence
Author
M. Lei Geng
Identifier
DOI:10.1117/12.3031934
Abstract
One of the primary challenges in integrating large-scale data sources is entity resolution, which involves linking records that refer to the same entity. In recent years, deep learning has been proposed as a solution to entity resolution. However, insufficient feature extraction and inadequate feature integration during the entity resolution process have led to sub-optimal results. This paper proposes Multi-Channel BERT for Entity Resolution (MCBER), a method that first translates the target data into different languages and uses data augmentation to expand the labeled data. These data are then fed into a multi-channel BERT model for feature extraction, followed by deeper feature extraction using an LSTM. Finally, abstract features are induced from the hidden layers. The method is compared with state-of-the-art entity resolution methods on publicly available datasets, and the experimental results show that it achieves higher F1 scores and exhibits good stability.
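The abstract reports results as F1 scores. As a minimal sketch of how F1 is commonly computed for entity resolution, the Python below scores a set of predicted matching record pairs against a gold set. The function names (`normalize`, `pairwise_f1`), the record identifiers, and the choice to treat pairs as unordered are illustrative assumptions; the paper's own evaluation code is not shown in this abstract and may differ.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def normalize(pairs: Iterable[Pair]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same match (an assumption of this sketch)."""
    return {tuple(sorted(p)) for p in pairs}


def pairwise_f1(predicted: Iterable[Pair], gold: Iterable[Pair]) -> float:
    """Harmonic mean of precision and recall over matched record pairs."""
    pred, true = normalize(predicted), normalize(gold)
    true_positives = len(pred & true)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(true)
    return 2 * precision * recall / (precision + recall)


# Hypothetical record identifiers, for illustration only.
gold = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
predicted = [("a1", "b1"), ("b2", "a2"), ("a4", "b9")]
score = pairwise_f1(predicted, gold)
```

Here two of the three predicted pairs are correct and two of the three gold pairs are recovered, so precision and recall are both 2/3 and so is the F1 score.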