Using ChatGPT for Entity Matching

Keywords: Computer science; Transformer; Matching (statistics); Task (project management); Artificial intelligence; Context (archaeology); Training set; Set (abstract data type); Similarity (geometry); Machine learning; Data mining; Voltage; Mathematics; Statistics; Engineering; Paleontology; Electrical engineering; Image (mathematics); Programming language; Systems engineering; Biology
Authors
Ralph Peeters, Christian Bizer
Source
Venue: Cornell University - arXiv
Date: 2023-05-05
Citations: 3
Identifier
DOI: 10.48550/arxiv.2305.03423
Abstract
Entity Matching is the task of deciding whether two entity descriptions refer to the same real-world entity. State-of-the-art entity matching methods often rely on fine-tuning Transformer models such as BERT or RoBERTa. Two major drawbacks of using these models for entity matching are that (i) the models require significant amounts of fine-tuning data to reach good performance and (ii) the fine-tuned models are not robust to out-of-distribution entities. In this paper, we investigate using ChatGPT for entity matching as a more robust, training-data-efficient alternative to traditional Transformer models. We perform experiments along three dimensions: (i) general prompt design, (ii) in-context learning, and (iii) the provision of higher-level matching knowledge. We show that ChatGPT is competitive with a fine-tuned RoBERTa model, reaching a zero-shot performance of 82.35% F1 on a challenging matching task for which RoBERTa requires 2000 training examples to reach similar performance. Adding in-context demonstrations to the prompts further improves F1 by up to 7.85% when using similarity-based example selection. Always using the same set of 10 handpicked demonstrations leads to an improvement of 4.92% over the zero-shot performance. Finally, we show that ChatGPT can also be guided by adding higher-level matching knowledge in the form of rules to the prompts. Providing matching rules leads to similar performance gains as providing in-context demonstrations.
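The zero-shot setting described in the abstract reduces to a single prompt per candidate pair. Below is a minimal sketch of such a prompt, assuming the openai Python client (v1+) and an API key in the environment; the prompt wording, model choice, and product pair are illustrative assumptions, not the paper's exact template.

```python
# A minimal sketch of zero-shot entity matching with a ChatGPT-family model,
# assuming the openai Python client (>= 1.0) and OPENAI_API_KEY set in the
# environment. The prompt text and example pair are illustrative only.
from openai import OpenAI

client = OpenAI()

def match_zero_shot(entity_a: str, entity_b: str) -> str:
    """Ask the model whether two entity descriptions refer to the same product."""
    prompt = (
        "Do the following two entity descriptions refer to the same "
        "real-world product? Answer with 'Yes' or 'No'.\n"
        f"Entity 1: {entity_a}\n"
        f"Entity 2: {entity_b}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # a ChatGPT model; the paper's exact model is not stated here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,          # deterministic answers make evaluation reproducible
    )
    return response.choices[0].message.content.strip()

print(match_zero_shot(
    "DYMO D1 Tape 12mm x 7m, black on white",
    "Dymo D1 12mm x 7m label tape, black/white",
))
```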
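For the in-context learning dimension, the abstract reports that similarity-based selection of demonstrations improves F1 by up to 7.85%. The sketch below shows one plausible realization: ranking a small labeled pool by TF-IDF cosine similarity to the query pair and prepending the top matches to the prompt. The similarity measure, the pool, and the helper names are assumptions for illustration, not the paper's setup.

```python
# A sketch of similarity-based in-context example selection, assuming a small
# hypothetical pool of labeled (entity_a, entity_b, label) triples. TF-IDF
# cosine similarity is one simple ranking choice, not the paper's method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

labeled_pool = [
    ("DYMO D1 tape 12mm black/white", "Dymo D1 12mm label tape", "Yes"),
    ("Logitech M185 wireless mouse", "Logitech M185 mouse, grey", "Yes"),
    ("HP 301 ink cartridge black", "HP 302 ink cartridge black", "No"),
]

def select_demonstrations(query_a: str, query_b: str, k: int = 2):
    """Return the k labeled pairs most similar to the query pair."""
    corpus = [f"{a} {b}" for a, b, _ in labeled_pool]
    vectorizer = TfidfVectorizer().fit(corpus + [f"{query_a} {query_b}"])
    pool_vecs = vectorizer.transform(corpus)
    query_vec = vectorizer.transform([f"{query_a} {query_b}"])
    scores = cosine_similarity(query_vec, pool_vecs)[0]
    ranked = sorted(zip(scores, labeled_pool), key=lambda x: -x[0])
    return [pair for _, pair in ranked[:k]]

def build_prompt(query_a: str, query_b: str, demos) -> str:
    """Prepend demonstrations to the yes/no matching question."""
    lines = ["Do the two entity descriptions refer to the same product? "
             "Answer with 'Yes' or 'No'."]
    for a, b, label in demos:
        lines.append(f"Entity 1: {a}\nEntity 2: {b}\nAnswer: {label}")
    lines.append(f"Entity 1: {query_a}\nEntity 2: {query_b}\nAnswer:")
    return "\n\n".join(lines)

demos = select_demonstrations("HP 301XL black ink", "HP 301 XL ink cartridge, black")
print(build_prompt("HP 301XL black ink", "HP 301 XL ink cartridge, black", demos))
```

The same prompt-assembly step could prepend handwritten matching rules instead of demonstrations, which the abstract reports yields similar performance gains.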