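Interpretability
Computer science
Metric (unit)
Set (abstract data type)
Data mining
Field (mathematics)
Software metric
Software
Empirical research
Machine learning
Software bug
Predictive modelling
Selection (genetic algorithm)
Class (philosophy)
Performance metric
Transformation (genetics)
Artificial intelligence
Software development
Software quality
Programming language
Statistics
Mathematics
Pure mathematics
Management
Chemistry
Operations management
Economics
Gene
Biochemistry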
Authors
Haowen Chen, Xiao‐Yuan Jing, Zhiqiang Li, Di Wu, Peng Yi, Zhiguo Huang
Identifier
DOI:10.1109/tse.2020.2968520
Abstract
Software defect prediction has long been a hot research topic in software engineering because it helps allocate limited resources reasonably. Compared with cross-project defect prediction (CPDP), heterogeneous defect prediction (HDP) further relaxes the constraints on the defect data used for prediction, permitting the source and target projects to contain different metric sets. However, a holistic understanding of existing HDP studies is still lacking because of their different evaluation strategies and experimental settings. In this paper, we provide an empirical study of HDP approaches. We systematically review the research status and compare the HDP approaches proposed from 2014 to June 2018. Furthermore, we investigate the feasibility of HDP approaches in CPDP. Through extensive experiments on 30 projects from five datasets, we have the following findings: (1) Metric transformation-based HDP approaches usually yield better prediction performance, while metric selection-based approaches have better interpretability. Overall, the HDP approach proposed by Li et al. (CTKCCA) currently has the best performance. (2) Handling class imbalance can boost prediction performance, but the improvements are usually limited. In addition, utilizing mixed project data cannot consistently improve the performance of HDP approaches, since the label information in the target project is not used effectively. (3) HDP approaches are feasible for cross-project defect prediction in which the source and target projects have the same metric set.