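Robustness (evolution)
Computer science
Training set
Debugging
Machine learning
Adversarial system
Artificial intelligence
Binary classification
Binary number
Labeled data
Data mining
Support vector machine
Mathematics
Arithmetic
Biochemistry
Chemistry
Gene
Programming language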
Authors
Wenfei Fan, Xiaoyu Han, Weilong Ren, Zihuan Xu
Abstract
This paper studies the enhancement of training data D to improve the robustness of machine learning (ML) classifiers M against adversarial attacks on relational data. Data enhancing aims to (a) defuse poisoned imperceptible features embedded in D, and (b) defend against attacks at prediction time that are unseen in D. We show that while there is an inherent tradeoff between the accuracy and robustness of M in case (b), data enhancing can improve both accuracy and robustness at the same time in case (a). We formulate two data enhancing problems accordingly, and show that both are intractable. Despite this hardness, we propose a framework that integrates model training and data enhancing. Moreover, we develop algorithms for (a) detecting and debugging corrupted imperceptible features in training data, and (b) selecting and adding adversarial examples to training data to defend against unseen attacks at prediction time. Using real-life datasets, we empirically verify that the method is at least 20.4% more robust and 2.02X faster than state-of-the-art (SOTA) methods for classifiers M, without degrading the accuracy of M.
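To make case (b) concrete, the following is a minimal sketch of the general idea of adding adversarial examples to training data and retraining. It is not the paper's algorithm: the abstract does not say how examples are selected, so this sketch swaps in a plain FGSM-style perturbation on a small logistic regression model, and it assumes purely numeric features. All names (`train_logreg`, `perturb`, `enhance`, `eps`, `rounds`) are hypothetical.

```python
import math

def sigmoid(z):
    # Clamp z so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_logreg(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent; returns (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi) + b) - yi  # d(log-loss)/d(logit)
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def sign(v):
    return (v > 0) - (v < 0)

def perturb(X, y, w, b, eps):
    """FGSM-style step: move each feature by eps in the loss-increasing direction."""
    out = []
    for xi, yi in zip(X, y):
        err = sigmoid(dot(w, xi) + b) - yi
        # The gradient of the log-loss w.r.t. the input x is err * w.
        out.append([xj + eps * sign(err * wj) for xj, wj in zip(xi, w)])
    return out

def enhance(X, y, eps=0.3, rounds=2):
    """Alternate training and augmentation (assumed scheme, not the paper's)."""
    Xa, ya = [row[:] for row in X], list(y)
    w, b = train_logreg(Xa, ya)
    for _ in range(rounds):
        Xa += perturb(X, y, w, b, eps)  # adversarial copies keep their labels
        ya += list(y)
        w, b = train_logreg(Xa, ya)
    return Xa, ya, w, b

X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [-1.0, -2.0], [-2.0, 0.5], [0.5, -2.0]]
y = [1, 1, 1, 0, 0, 0]
Xa, ya, w, b = enhance(X, y, eps=0.3, rounds=2)
```

Each round appends one perturbed copy of the original rows, so the enhanced set grows linearly with `rounds`. A selective scheme, as the abstract describes, would keep only a subset of these examples.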