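Reinforcement learning
Computer science
Artificial intelligence
Ensemble learning
Software
Feature (linguistics)
Feature extraction
Machine learning
Extraction (chemistry)
Pattern recognition (psychology)
Programming language
Chemistry
Chromatography
Philosophy
Linguistics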
Authors
Mohsen Hesamolhokama, Amirahmad Shafiee, Mohammadreza Ahmadi Teshnizi, Mohammadamin Fazli, Jafar Habibi
Source
Journal: Cornell University - arXiv
Date: 2024-12-10
Identifier
DOI: 10.48550/arxiv.2412.07927
Abstract
Ensuring software quality remains a critical challenge in complex and dynamic development environments, where software defects can result in significant operational and financial risks. This paper proposes an innovative framework for software defect prediction that combines ensemble feature extraction with reinforcement learning (RL)-based feature selection. We claim that this work is among the first recent efforts to address this challenge at file-level granularity. The framework extracts diverse semantic and structural features from source code using five code-specific pre-trained models. Feature selection is enhanced through a custom-defined embedding space tailored to represent feature interactions, coupled with a pheromone table mechanism inspired by Ant Colony Optimization (ACO) to guide the RL agent effectively. Using the Proximal Policy Optimization (PPO) algorithm, the proposed method dynamically identifies the most predictive features for defect detection. Experimental evaluations conducted on the PROMISE dataset highlight the framework's superior performance on the F1-Score metric, achieving an average improvement of 6.25% over traditional methods and baseline models across diverse datasets. This study underscores the potential for integrating ensemble learning and RL for adaptive and scalable defect prediction in modern software systems.