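Reinforcement learning
Computer science
Markov decision process
Database transaction
Demand response
Dynamic pricing
Process (computing)
Trading strategy
Peer-to-peer
Grid
Electricity
Smart grid
Mathematical optimization
Distributed computing
Markov process
Artificial intelligence
Engineering
Microeconomics
Business
Economics
Statistics
Geometry
Mathematics
Electrical engineering
Finance
Programming language
Operating system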
Authors
Yiqun Wang,Qingyu Yang,Donghe Li
Abstract
With the large-scale integration of distributed energy resources, a growing number of users have become prosumers who produce, store, and consume electric energy. Peer-to-peer (P2P) energy trading, a new mechanism that allows direct energy transactions between prosumers, is becoming increasingly widespread. A crucial open problem is how to determine a trading strategy for prosumers in P2P energy trading that satisfies multiple optimization objectives simultaneously. To this end, this paper introduces a demand response mechanism and uses a dissatisfaction function to represent the electricity consumption of prosumers. The mid-market rate price is adopted to attract more prosumers to participate in P2P energy trading. The P2P energy trading process among multiple prosumers in a community is formulated as a Markov decision process. We design a deep reinforcement learning (DRL) method to solve for the optimal trading policy of prosumers. Through continual interaction with the environment, DRL learns the optimal strategies autonomously. In addition, the deep deterministic policy gradient algorithm is well suited to the continuous and complex decision problems that arise in the P2P energy trading market. By carefully constructing the reinforcement learning environment, this paper achieves multi-objective collaborative optimization. Simulation results show that the proposed algorithm and model reduce costs by 16.5% compared with transactions between prosumers and the grid, and can effectively decrease the dependence of prosumers on the main grid.
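The abstract does not spell out how the mid-market rate price is computed. As a rough illustration only, the sketch below follows a formulation commonly used in the P2P energy trading literature: the internal price sits at the midpoint between the grid's retail (import) price and its feed-in (export) price, and the side in excess is blended with the grid price for the unmatched energy. The function name `mid_market_rate`, its parameters, and the three-case rule are assumptions made for this example and may differ from the authors' model.

```python
def mid_market_rate(total_gen, total_demand, grid_buy, grid_sell):
    """Return (p2p_buy_price, p2p_sell_price) for one trading interval.

    Hypothetical helper, not taken from the paper.
    total_gen    : total energy offered by prosumers in the community (kWh)
    total_demand : total energy requested by prosumers in the community (kWh)
    grid_buy     : price for importing energy from the main grid
    grid_sell    : price paid for exporting energy to the main grid
    """
    mid = (grid_buy + grid_sell) / 2.0

    # With no supply or no demand inside the community there is no P2P
    # trade, so prosumers simply face the grid prices.
    if total_gen <= 0 or total_demand <= 0:
        return grid_buy, grid_sell

    if total_gen == total_demand:
        # Balanced community: both sides trade at the midpoint.
        return mid, mid

    if total_gen > total_demand:
        # Surplus generation: buyers pay the midpoint, while sellers get a
        # blend of the midpoint (matched energy) and the export price
        # (surplus sent to the grid).
        surplus = total_gen - total_demand
        sell = (total_demand * mid + surplus * grid_sell) / total_gen
        return mid, sell

    # Generation shortfall: sellers receive the midpoint, while buyers pay a
    # blend of the midpoint (matched energy) and the import price (deficit
    # bought from the grid).
    deficit = total_demand - total_gen
    buy = (total_gen * mid + deficit * grid_buy) / total_demand
    return buy, mid


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from the paper.
    print(mid_market_rate(total_gen=20.0, total_demand=10.0,
                          grid_buy=0.20, grid_sell=0.10))
```

Under this rule the internal buy price never exceeds the grid import price and the internal sell price never falls below the grid export price, which is the sense in which such pricing can make P2P participation attractive. In an environment like the one the abstract describes, prices of this kind would feed into the reward that the DRL agent observes.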