Computer science
Machine learning
Training (meteorology)
Artificial intelligence
Training set
Psychology
Physics
Meteorology
Authors
Riddhi More, Jeremy S. Bradbury
Source
Journal: Cornell University - arXiv
Date: 2025-02-03
Cited by: 1
Identifier
DOI: 10.48550/arxiv.2502.01825
Abstract
Data augmentation has become a standard practice in software engineering to address limited or imbalanced data sets, particularly in specialized domains like test classification and bug detection where data can be scarce. Although techniques such as SMOTE and mutation-based augmentation are widely used in software testing and debugging applications, a rigorous understanding of how augmented training data impacts model bias is lacking. It is especially critical to consider bias in scenarios where augmented data sets are used not just in training but also in testing models. Through a comprehensive case study of flaky test classification, we demonstrate how to test for bias and understand the impact that the inclusion of augmented samples in testing sets can have on model evaluation.
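The abstract names SMOTE as a widely used augmentation technique for imbalanced data sets such as flaky-test corpora. As a rough illustration only (this is not the paper's implementation, and the function name `smote_like` and the toy feature vectors are hypothetical), SMOTE-style oversampling synthesizes new minority-class points by interpolating between existing minority samples:

```python
import random

def smote_like(minority, n_new, seed=0):
    """Sketch of SMOTE-style oversampling: each synthetic sample is a
    random linear interpolation between two minority-class samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        b = rng.choice(minority)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(ai + lam * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

# Toy example: a scarce "flaky" class with only three feature vectors.
flaky = [(0.9, 0.1), (0.8, 0.3), (0.7, 0.2)]
new_samples = smote_like(flaky, n_new=5)
print(len(new_samples))
```

Because each synthetic point lies on a segment between two real minority samples, the augmented set stays inside the convex hull of the originals; the paper's concern is what happens to bias when such synthetic points also appear in the *test* set used for model evaluation.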