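Ranking (information retrieval)
Computer science
Metric (unit)
Task (project management)
Similarity (geometry)
Artificial intelligence
Machine learning
Learning to rank
Data mining
Software bug
Binary classification
Information retrieval
Deep learning
Empirical research
Support vector machine
Software
Image (mathematics)
Engineering
Mathematics
Statistics
Programming language
Systems engineering
Operations management
Authors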
Yuan Jiang,Xin Su,Christoph Treude,Shang Chen,Tiantian Wang
Identifier
DOI:10.1016/j.jss.2023.111607
Abstract
Do Deep Learning (DL) techniques actually help to improve the performance of duplicate bug report detection? Prior studies suggest that they do, if the duplicate bug report detection task is treated as a binary classification problem. However, in realistic scenarios, the task is often viewed as a ranking problem, which predicts potential duplicate bug reports by ranking based on similarities with existing historical bug reports. There is little empirical evidence to support that DL can be effectively applied to detect duplicate bug reports in the ranking scenario. Therefore, in this paper, we investigate whether well-known DL-based methods outperform classic information retrieval (IR) based methods on the duplicate bug report detection task. In addition, we argue that both IR- and DL-based methods suffer from incompletely evaluating the similarity between bug reports, resulting in the loss of important information. To address this problem, we propose a new method that combines IR and DL techniques to compute textual similarity more comprehensively. Our experimental results show that the DL-based method itself does not yield high performance compared to IR-based methods. However, our proposed combined method improves on the MAP metric of classic IR-based methods by a median of 7.09%–11.34% and a maximum of 17.228%–28.97%.
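To make the ranking scenario and the MAP metric mentioned in the abstract concrete, the following is a minimal illustrative sketch, not the method proposed in the paper. It assumes a plain TF-IDF cosine similarity as a stand-in for a classic IR-based scorer, and all function names (`rank_candidates`, `average_precision`, `mean_average_precision`) and the toy bug reports are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real systems would also stem and filter stop words.
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that no term receives a zero or negative weight.
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query, history):
    """Return indices of historical reports, most similar to the query first."""
    vecs = tfidf_vectors([tokenize(query)] + [tokenize(h) for h in history])
    q, rest = vecs[0], vecs[1:]
    scores = [cosine(q, v) for v in rest]
    return sorted(range(len(history)), key=lambda i: -scores[i])


def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of true duplicates."""
    hits, total = 0, 0.0
    for rank, idx in enumerate(ranked, start=1):
        if idx in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """MAP: the mean of the per-query average precision values."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: report 0 is the true duplicate of the new report.
history = [
    "app crashes when saving file",
    "login button color is wrong",
    "crash on file save dialog",
]
ranking = rank_candidates("crash when saving a file", history)
score = mean_average_precision([ranking], [{0}])
```

In this framing, a new report is scored against every historical report and the candidates are returned in descending similarity order; MAP then rewards rankings that place true duplicates near the top. A binary-classification framing would instead label individual report pairs, which is the distinction the abstract draws.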