Dropout (neural networks)
Artificial intelligence
Computer science
Deep learning
Machine learning
Authors
Pravas Ranjan Bal, Sandeep Kumar
Abstract
The primary goal of software defect prediction (SDP) is to predict the defects in a given software project using historical data or data from past releases. The existing state of the art in SDP primarily considers two prediction scenarios: Within-Project Defect Prediction (WPDP) and Cross-Project Defect Prediction (CPDP). In the WPDP scenario, the model is trained and tested on different parts of the same dataset, or trained on a dataset from a previous version of the same project. In the CPDP scenario, training and testing occur on datasets from different software projects. Because historical datasets or defect datasets from prior releases are often unavailable, CPDP is more useful in real-life scenarios, but it is also a very challenging problem in the SDP domain. Machine learning (ML) models sometimes perform poorly in the CPDP scenario because of inadequate training. To support better CPDP performance, an ML model must be built carefully, with attention to reducing training error and avoiding overfitting. To address these issues, we propose a cross-project data preprocessing method, Unique Selection of Matched Metrics (USMM), which correlates the metrics of different project datasets using the Kolmogorov-Smirnov (KS) test and the Hungarian method. To further improve CPDP performance, we also use a dropout regularized deep learning (DRDL) model. We use 34 software defect datasets to validate the DRDL model and the USMM method. The experimental results show that the DRDL model combined with the USMM method (DRDL-USMM) is a promising model for improving prediction accuracy, with an improvement of 3.3% to 8.5% over existing works in the CPDP scenario.
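The abstract names the two ingredients of the metric matching step (the KS test and the Hungarian method) but not how they are combined. The following is a minimal sketch of one plausible combination, not the authors' implementation. It assumes that the similarity between a source metric and a target metric is the two-sample KS-test p-value, and that the one-to-one pairing is the one that maximizes the total p-value. The function name `match_metrics` and the array layout (rows are modules, columns are metrics) are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.stats import ks_2samp


def match_metrics(source, target):
    """Pair source metric columns with target metric columns one-to-one.

    Assumption (not stated in the abstract): similarity is the two-sample
    KS-test p-value, and the best pairing maximizes the summed p-values.
    Returns a list of (source_index, target_index, p_value) tuples.
    """
    n_src, n_tgt = source.shape[1], target.shape[1]

    # Score every (source metric, target metric) pair with the KS test.
    pvalues = np.zeros((n_src, n_tgt))
    for i in range(n_src):
        for j in range(n_tgt):
            pvalues[i, j] = ks_2samp(source[:, i], target[:, j]).pvalue

    # Solve the assignment problem (Hungarian-style) so that each metric
    # is used at most once and the total similarity is as large as possible.
    rows, cols = linear_sum_assignment(pvalues, maximize=True)
    return [(int(i), int(j), float(pvalues[i, j])) for i, j in zip(rows, cols)]
```

Calling `match_metrics(source, target)` on two NumPy arrays returns the selected pairs with their p-values. A real pipeline would probably also discard pairs whose p-value falls below some cutoff, but the abstract does not specify one, so none is applied here.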