Keywords
Markov decision process
Mathematical optimization
Economic dispatch
Reinforcement learning
Computer science
Solver
Power system
Optimal control
Entropy maximization
Maximization
Markov process
Power (physics)
Principle of maximum entropy
Mathematics
Artificial intelligence
Statistics
Physics
Quantum mechanics
Authors
Ahmed Rabee Sayed,Cheng Wang,H. Anis,Tianshu Bi
Identifier
DOI:10.1109/tpwrs.2022.3220799
Abstract
Due to the increasing uncertainty of renewable energy and stochastic demand, fast and optimal control actions are necessary to maintain system stability and economic operation. Existing optimal power flow (OPF) solution methods must be enhanced to guarantee solution optimality and feasibility in real-time operation under such uncertainties. This paper proposes a convex constrained soft actor-critic (CC-SAC) deep reinforcement learning (DRL) algorithm for the AC-OPF problem. First, the problem is formulated as a Markov decision process so that it can be solved by DRL algorithms. Second, the operational constraints are satisfied by a novel convex safety layer based on the penalty convex-concave procedure (P-CCP). Then, the control policy is updated by the state-of-the-art off-policy, entropy-maximization-based SAC algorithm. CC-SAC thus combines data-driven and physics-driven approaches: the former speeds up the solution by predicting near-optimal control actions with a deep neural network, while the latter effectively guarantees solution feasibility. Simulation results demonstrate that the proposed CC-SAC finds AC-OPF decisions with no constraint violation, zero optimality gap, and a speedup of up to 34 times over a state-of-the-art solver. These results indicate the approach's practicality for real-time power system operation and electricity markets.
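The core control step described in the abstract can be pictured as a two-stage pipeline: the SAC actor proposes a raw action, and a convex safety layer maps it to the nearest feasible action before it is applied. The sketch below is a minimal illustration of that idea only, not the paper's method: it replaces the P-CCP-based convexification of the AC-OPF constraints with a simple quadratic projection onto hypothetical linear constraints, and the matrices `A`, `b` and the bounds are placeholder values.

```python
# Minimal illustrative sketch of a CC-SAC-style control step: a policy
# proposes a raw action and a convex "safety layer" projects it onto a
# feasible set. The linear constraints A x <= b are a hypothetical stand-in
# for the convexified AC-OPF constraints the paper handles via the penalty
# convex-concave procedure (P-CCP); all numbers here are made up.
import numpy as np
import cvxpy as cp

def safety_layer(raw_action, A, b, lower, upper):
    """Project raw_action onto {x : A x <= b, lower <= x <= upper}."""
    x = cp.Variable(raw_action.shape[0])
    objective = cp.Minimize(cp.sum_squares(x - raw_action))
    constraints = [A @ x <= b, x >= lower, x <= upper]
    cp.Problem(objective, constraints).solve()
    return x.value

# Hypothetical 2-D action space (e.g., two generator set-points in p.u.).
A = np.array([[1.0, 1.0]])        # coupling constraint, e.g., a total-output cap
b = np.array([1.5])
lower, upper = np.zeros(2), np.ones(2)

raw_action = np.array([0.9, 0.95])            # e.g., sampled from the SAC actor
safe_action = safety_layer(raw_action, A, b, lower, upper)
print(safe_action)                            # feasible action applied to the grid
```

In this reading, the neural policy supplies the speed (one forward pass instead of a full OPF solve) while the convex layer supplies the feasibility guarantee, which matches the abstract's characterization of CC-SAC as a combination of data-driven and physics-driven components.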