Recommender systems
Matching (statistics)
Computer science
Information retrieval
Artificial intelligence
Data mining
Mathematics
Statistics
Authors
Zhen Li,Jibin Wang,Zhuo Chen,Kun‐Shan Wu,Liang Liu,Meng Ai,Li Liu
Abstract
Ranking is a core task in recommender systems, and various ranking metrics have been designed to assess the quality of ordered item lists. A reliable evaluation method is crucial for identifying optimal recommendation algorithms, directly impacting downstream tasks and platform revenue. However, in industrial scenarios, users tend to rate only the items they prefer, which biases ranking metric evaluations. Recent works propose pseudo-labeling and reweighting for debiasing and thereby achieving unbiased evaluation. Although theoretically promising, pseudo-labeling relies directly on extrapolation from the collected biased feedback, and propensity-based reweighting suffers from high variance when propensities are small, a consequence of the data sparsity common in real-world scenarios; both methods therefore lack stability. In this paper, we propose a novel Doubly Robust Matching for Recommendation (DRM4Rec) method to achieve unbiased ranking metric evaluation. Compared with existing approaches, DRM4Rec reduces the otherwise unavoidable high variance caused by small propensities and mitigates the direct harm to prediction performance from incorrect extrapolation. In addition, the proposed method is doubly robust: it achieves unbiased ranking metric evaluation when either the imputed relevance or the learned propensities are accurate. We conduct extensive semi-synthetic and real-world experiments evaluating three representative recommendation models, and the results show that DRM4Rec yields significant improvements in unbiased ranking metric evaluation.
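To make the double-robustness property concrete, the following is a minimal sketch of the generic doubly robust (DR) estimator for an average relevance metric that methods of this family build on. It is not the paper's DRM4Rec algorithm; all function and variable names here are hypothetical, introduced only for illustration.

```python
import numpy as np

def dr_estimate(observed, relevance, imputed, propensity):
    """Generic doubly robust estimate of average relevance.

    observed   : 0/1 mask of user-item pairs with observed ratings
    relevance  : observed relevance labels (arbitrary where unobserved)
    imputed    : model-imputed relevance for every pair
    propensity : estimated probability that each pair is observed
    """
    # Start from the imputation, then correct it on observed entries,
    # reweighted by inverse propensity to undo the selection bias.
    correction = observed * (relevance - imputed) / propensity
    return float(np.mean(imputed + correction))

# Toy example: 4 user-item pairs, 3 of them observed.
observed = np.array([1, 0, 1, 1])
relevance = np.array([1.0, 0.0, 0.0, 1.0])
imputed = np.array([0.5, 0.5, 0.5, 0.5])
propensity = np.array([0.5, 0.5, 1.0, 1.0])
print(dr_estimate(observed, relevance, imputed, propensity))
```

The estimator is unbiased if either input model is accurate: when the imputed relevance is exact, the correction term vanishes and the estimate equals the true mean regardless of the propensities; when the propensities are exact, the inverse-propensity correction is unbiased regardless of the imputation. The sketch also illustrates the variance problem the abstract describes: dividing by a small `propensity` inflates the correction term.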