Dictionary Learning-Structured Reinforcement Learning With Adaptive-Sparsity Regularizer

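Reinforcement learning; Computer science; Artificial intelligence; Adaptive learning; Machine learning; Pattern recognition (psychology)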

Authors

Zhenni Li, Jianhao Tang, Haoli Zhao, Ci Chen, Shengli Xie

Source

Journal: IEEE Transactions on Aerospace and Electronic Systems [Institute of Electrical and Electronics Engineers]
Date: 2023-12-20; Volume (Issue): 60 (2): 1753-1769; Citations: 2

Identifier

DOI: 10.1109/taes.2023.3342794

Abstract

Deep reinforcement learning (DRL) has been applied to satellite navigation and positioning applications. Its performance relies heavily on the function-approximation capability of deep neural networks. However, existing DRL models suffer from catastrophic interference, resulting in inaccurate function approximation. Sparse-coding-based DRL is an effective method for mitigating this interference, but existing methods face two challenging issues. First, the value function estimation network suffers from instability in gradient backpropagation, including gradient explosion and gradient vanishing. Second, existing methods are limited to hand-crafted sparse regularizers that produce only static sparsity, which may be difficult to apply in various dynamic reinforcement learning (RL) environments. In this article, we propose a novel dictionary learning (DL)-structured RL model with an adaptive-sparsity regularizer (ASR) that alleviates catastrophic interference and enables accurate value function approximation, thereby improving RL performance. To alleviate the interference and avoid the instability problems in RL, a feedforward DL-structured RL model is constructed to predict the value function without the need for gradient backpropagation. To learn data-driven sparse representations with adaptive sparsity, we propose to use the learnable sparse regularizer ASR in the model, where the key hyperparameters of ASR can be trained to adapt to variable RL environments. To optimize the model efficiently, the model parameters are first pretrained in the pretraining stage; in the control training stage, only the value weights used for value function approximation need to be fine-tuned for actual RL applications. Our comparative experiments in benchmark environments demonstrate that the proposed method can outperform existing state-of-the-art sparse-coding-based RL algorithms. In terms of accumulated rewards (used to measure the quality of the learned policy), the improvement was over 63% in the Cart Pole environment and nearly 10% in Puddle World. Furthermore, the proposed algorithm can maintain its relatively high performance in the presence of noise up to 20 dB.
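
The general idea in the abstract (encode the state as a sparse code over a dictionary, then tune only linear value weights during control) can be illustrated with a rough sketch. This is not the authors' model. It assumes a fixed L1 penalty solved by plain ISTA in place of the learnable ASR, a hand-picked dictionary `D` in place of a pretrained one, and a TD(0) update for the value weights. All names (`sparse_code`, `td0_update`, `D`) are hypothetical.

```python
# Illustrative sketch only: fixed L1 sparsity (not the paper's ASR),
# hand-picked dictionary (not a pretrained one), pure-Python arithmetic.

C = 2 ** -0.5
# Dictionary with 2-dimensional states and 4 unit-norm atoms (columns).
# For this D the largest eigenvalue of D^T D is 2, so any step <= 0.5 is safe.
D = [
    [1.0, 0.0, C, C],
    [0.0, 1.0, C, -C],
]


def soft_threshold(x, lam):
    """Proximal operator of lam * |x| (shrinks x toward zero)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def sparse_code(D, s, lam=0.1, step=0.25, iters=200):
    """ISTA for min_z 0.5 * ||s - D z||^2 + lam * ||z||_1."""
    n_dims, n_atoms = len(D), len(D[0])
    z = [0.0] * n_atoms
    for _ in range(iters):
        # Residual r = D z - s
        r = [sum(D[i][j] * z[j] for j in range(n_atoms)) - s[i]
             for i in range(n_dims)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n_dims))
             for j in range(n_atoms)]
        # Gradient step followed by shrinkage
        z = [soft_threshold(z[j] - step * g[j], step * lam)
             for j in range(n_atoms)]
    return z


def value(w, z):
    """Linear value estimate V(s) = w . z on the sparse code z."""
    return sum(wi * zi for wi, zi in zip(w, z))


def td0_update(w, z, reward, z_next, gamma=0.99, alpha=0.1, done=False):
    """One TD(0) step that changes only the value weights w."""
    target = reward + (0.0 if done else gamma * value(w, z_next))
    delta = target - value(w, z)
    return [wi + alpha * delta * zi for wi, zi in zip(w, z)]
```

In this sketch the dictionary stays fixed while `td0_update` adjusts the value weights, loosely mirroring the abstract's split between a pretraining stage and a control training stage. A larger `lam` yields sparser codes. The paper instead learns the regularizer's hyperparameters, which this sketch does not attempt.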
