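Interpretability
Reinforcement learning
Artificial intelligence
Computer science
Black box
Process (computing)
Machine learning
Adversarial system
Deep learning
Operating system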
Authors
Yan Xie,Soroush Vosoughi,Saeed Hassanpour
Identifier
DOI:10.1109/icpr56361.2022.9956245
Abstract
Artificial intelligence, particularly through recent advances in deep learning (DL), has achieved exceptional performance on many tasks in fields such as natural language processing and computer vision. In certain high-stakes domains, strong performance metrics are not enough: a high level of interpretability is often required before AI can be reliably used. Unfortunately, the black-box nature of DL models prevents researchers from providing explanatory descriptions of a DL model’s reasoning process and decisions. In this work, we propose a novel framework utilizing Adversarial Inverse Reinforcement Learning that can provide global explanations for decisions made by a Reinforcement Learning model and capture intuitive tendencies that the model follows by summarizing the model’s decision-making process.