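Computer science
Class (philosophy)
Construct (Python library)
Artificial intelligence
Stage (stratigraphy)
Software
Confusion matrix
Reliability (semiconductor)
Field (mathematics)
Data mining
Machine learning
Mathematics
Quantum mechanics
Pure mathematics
Biology
Programming language
Paleontology
Power (physics)
Physics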
Authors
Jiaying Li, Yan Lei, Shanshan Li, Haifang Zhou, Yue Yu, Zhouyang Jia, Yingwei Ma, Teng Wang
Identifier
DOI:10.1109/issre59848.2023.00070
Abstract
Classification tasks are prevalent and play a crucial role in the field of software engineering. However, when two classes exhibit similar features at the class level, the classification model is prone to misclassification, which we refer to as ambiguous classification, and the corresponding classes as ambiguous classes. Ambiguous classification may impact the security and reliability of software engineering classification systems. To correct ambiguous classification, we propose a two-stage framework. Our key insight is to combine two different classification models and utilize their complementary knowledge to maximize the classification ability of the two-stage framework. Specifically, we identify ambiguous classes according to the confusion matrix of the original model. Then, we construct a two-stage model, where the first stage utilizes the original model and the second stage utilizes a different model trained on the same dataset. The second-stage model is responsible for reclassifying the samples that are predicted as ambiguous classes by the first-stage model. We evaluate our method on two software engineering tasks. Experimental results indicate that our method can effectively correct ambiguous classification and achieve a relative improvement of 19.8% in F1-score for ambiguous classes.
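
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the rule for selecting ambiguous classes, so the pairwise confusion-rate threshold in `find_ambiguous_classes` is an assumption, as are all function names and the choice to take the second-stage prediction as final. Both models are represented as plain callables that map a sample to an integer class label.

```python
from typing import Callable, List, Sequence, Set


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """cm[t][p] counts samples whose true class is t and predicted class is p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def find_ambiguous_classes(cm: List[List[int]], threshold: float) -> Set[int]:
    """Assumed criterion (not from the paper): classes i and j are ambiguous
    when the share of their samples confused with each other exceeds threshold."""
    ambiguous: Set[int] = set()
    n = len(cm)
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(cm[i]) + sum(cm[j])
            if total == 0:
                continue  # neither class has samples; nothing to measure
            confused = cm[i][j] + cm[j][i]
            if confused / total > threshold:
                ambiguous.update((i, j))
    return ambiguous


def two_stage_predict(first_model: Callable[[object], int],
                      second_model: Callable[[object], int],
                      ambiguous: Set[int],
                      samples: Sequence[object]) -> List[int]:
    """Stage 1 labels every sample; stage 2 relabels only those samples
    that stage 1 assigned to an ambiguous class."""
    predictions = []
    for x in samples:
        label = first_model(x)
        if label in ambiguous:
            # Assumption: the second-stage prediction is taken as final.
            label = second_model(x)
        predictions.append(label)
    return predictions
```

In this sketch the ambiguous classes would be computed once from the first-stage model's confusion matrix on held-out data, and `two_stage_predict` would then be used at inference time. The threshold value is a tuning choice that the abstract does not specify.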