Leveraging LLMs for Unsupervised Dense Retriever Ranking

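Ranking (information retrieval), Computer science, Data science, Artificial intelligence, Business, Geography
Authors
Ekaterina Khramtsova, Shengyao Zhuang, Mahsa Baktashmotlagh, Guido Zuccon
Source
Journal: Cornell University - arXiv; Citations: 4
Identifier
DOI: 10.1145/3626772.3657798
Abstract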

In this paper we present Large Language Model Assisted Retrieval Model Ranking (LARMOR), an effective unsupervised approach that leverages LLMs for selecting which dense retriever to use on a test corpus (target). Dense retriever selection is crucial for many IR applications that rely on dense retrievers trained on public corpora to encode or search a new, private target corpus. This is because a dense retriever's performance often drops under domain shift, that is, when the corpus, domain, or task of the target differs from those the retriever was trained on. Furthermore, when the target corpus is unlabeled, e.g., in a zero-shot scenario, direct evaluation of the model on the target corpus becomes infeasible. Unsupervised selection of the most effective pre-trained dense retriever then becomes a crucial challenge. Current methods for dense retriever selection are insufficient in handling scenarios with domain shift. Our proposed solution leverages LLMs to generate pseudo-relevant queries, labels, and reference lists based on a set of documents sampled from the target corpus. Dense retrievers are then ranked based on their effectiveness on these generated pseudo-relevant signals. Notably, our method is the first approach that relies solely on the target corpus, eliminating the need for both training corpora and test labels. To evaluate the effectiveness of our method, we construct a large pool of state-of-the-art dense retrievers. The proposed approach outperforms existing baselines with respect to both dense retriever selection and ranking. We make our code and results publicly available at https://github.com/ielab/larmor/.
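
The selection idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation (their code is at the URL above); it only mirrors the general shape of the pipeline: sample documents from the target corpus, generate a pseudo-query per document, and rank candidate retrievers by how well each one recovers the source document. Everything named here is an assumption for illustration: `generate_pseudo_query` is a hypothetical stand-in for an LLM call, the two toy scoring functions stand in for real dense retrievers, and mean reciprocal rank is used as the effectiveness measure. The pseudo-labels and reference lists mentioned in the abstract are not modeled.

```python
import random
from typing import Callable, Dict, List, Tuple

# Assumption: a "retriever" is any function that scores a (query, document) pair.
Retriever = Callable[[str, str], float]


def generate_pseudo_query(doc: str) -> str:
    """Hypothetical stand-in for an LLM: reuse the first three words of the document."""
    return " ".join(doc.split()[:3])


def reciprocal_rank(retriever: Retriever, query: str,
                    corpus: List[str], source_idx: int) -> float:
    """Reciprocal rank of the document the pseudo-query was generated from."""
    scores = [retriever(query, doc) for doc in corpus]
    # Rank is 1 plus the number of documents scored strictly higher than the source.
    rank = 1 + sum(s > scores[source_idx] for s in scores)
    return 1.0 / rank


def rank_retrievers(retrievers: Dict[str, Retriever], corpus: List[str],
                    sample_size: int, seed: int = 0) -> List[Tuple[str, float]]:
    """Rank retrievers by mean reciprocal rank over generated pseudo-queries."""
    rng = random.Random(seed)
    sampled = rng.sample(range(len(corpus)), k=min(sample_size, len(corpus)))
    queries = {i: generate_pseudo_query(corpus[i]) for i in sampled}
    mrr = {
        name: sum(reciprocal_rank(r, q, corpus, i) for i, q in queries.items())
        / len(queries)
        for name, r in retrievers.items()
    }
    return sorted(mrr.items(), key=lambda kv: kv[1], reverse=True)


def overlap_retriever(query: str, doc: str) -> float:
    """Toy retriever: number of distinct words shared by query and document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def length_retriever(query: str, doc: str) -> float:
    """Toy retriever that ignores the query and prefers longer documents."""
    return float(len(doc))


corpus = [
    "dense retrievers encode queries and documents as vectors",
    "domain shift degrades zero-shot retrieval effectiveness",
    "large language models can generate synthetic queries",
    "unlabeled target corpora make direct evaluation infeasible and costly to approximate",
]
candidates = {"overlap": overlap_retriever, "length": length_retriever}
ranking = rank_retrievers(candidates, corpus, sample_size=4)
print(ranking)  # the query-aware toy retriever should be listed first
```

In this toy setup no training corpus or relevance labels are consulted; the only input is the target corpus itself, which is the property the abstract emphasizes.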
