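Bidding
Reinforcement learning
Computer science
Identification (biology)
Markov decision process
Artificial intelligence
Function (biology)
Electricity market
Machine learning
Mathematical optimization
Data mining
Operations research
Markov process
Electricity
Engineering
Economics
Microeconomics
Mathematics
Biology
Evolutionary biology
Statistics
Botany
Electrical engineering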
Authors
Hongye Guo, Qixin Chen, Qing Xia, Chongqing Kang
Identifier
DOI:10.1109/tpwrs.2021.3076296
Abstract
Due to the deregulation of power systems worldwide, research on bidding behavior simulation has gained prominence. One crucial element in these studies is accurately defining and modelling the individual reward function (or objective function). Given the ubiquitous information barriers between market participants and researchers, the common approach is to develop reward functions from theoretical assumptions, which inevitably causes deviations from the real world. However, since market data have gradually become transparent in recent years, especially data on historical bidding behaviors, it is now feasible to introduce data-driven methods to identify the individual reward functions hidden in raw bidding data. This paper therefore proposes a data-driven framework for identifying bidding objective functions, consisting of three procedures. First, the bidding decision processes of participants are formulated as a standard Markov decision process. Second, a deep inverse reinforcement learning method based on maximum entropy is introduced to identify individual reward functions, whose high-dimensional nonlinearity can be captured by multilayer perceptrons (MLPs). Third, a deep Q-network method is customized to simulate individual bidding behaviors based on the obtained MLP-based objective functions. The effectiveness and feasibility of the proposed framework and methods are tested on real market data from the Australian electricity market.
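The second procedure, maximum-entropy inverse reinforcement learning, can be illustrated with a minimal sketch. It is not the paper's implementation: it replaces the MLP reward with a tabular (one value per state) reward, and it uses a hypothetical five-state chain with two deterministic actions in place of a real bidding MDP. All names (`soft_value_iteration`, `expected_visitation`, `maxent_irl`) and all parameter values are assumptions made for illustration. The update follows the standard maximum-entropy gradient: expert state-visitation counts minus the counts expected under the current reward.

```python
import numpy as np

def soft_value_iteration(reward, P, horizon):
    """Finite-horizon soft (log-sum-exp) backup; returns policies[t][s, a]."""
    V = np.zeros(P.shape[0])                     # terminal value V_T = 0
    policies = []
    for _ in range(horizon):                     # backwards from t = T-1 to 0
        Q = reward[:, None] + P @ V              # Q[s, a] = r(s) + E[V(s')]
        q_max = Q.max(axis=1, keepdims=True)     # stabilise the log-sum-exp
        V = (q_max + np.log(np.exp(Q - q_max).sum(axis=1, keepdims=True))).ravel()
        policies.append(np.exp(Q - V[:, None]))  # pi(a|s) = exp(Q - V)
    return policies[::-1]                        # reorder to t = 0..T-1

def expected_visitation(policies, P, p0):
    """Expected state-visitation counts summed over the horizon."""
    D = p0.copy()
    total = np.zeros_like(p0)
    for pi in policies:
        total += D
        D = np.einsum("s,sa,sat->t", D, pi, P)   # propagate one step forward
    return total

def maxent_irl(expert_visitation, P, p0, horizon, lr=0.05, n_iters=2000):
    """Gradient ascent on the MaxEnt log-likelihood with a tabular reward."""
    reward = np.zeros(P.shape[0])
    for _ in range(n_iters):
        policies = soft_value_iteration(reward, P, horizon)
        grad = expert_visitation - expected_visitation(policies, P, p0)
        reward += lr * grad
    return reward

# Hypothetical toy MDP: 5-state chain, action 0 moves left, action 1 moves right.
n_states, horizon = 5, 5
P = np.zeros((n_states, 2, n_states))
for s in range(n_states):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, n_states - 1)] = 1.0
p0 = np.full(n_states, 1.0 / n_states)

# "Expert" data generated from a known reward so the recovery can be checked.
true_reward = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
expert_visitation = expected_visitation(
    soft_value_iteration(true_reward, P, horizon), P, p0)

learned_reward = maxent_irl(expert_visitation, P, p0, horizon)
learned_visitation = expected_visitation(
    soft_value_iteration(learned_reward, P, horizon), P, p0)
```

In this sketch the learned reward is only identified up to transformations that leave the visitation counts unchanged (for example, adding a constant), so the meaningful check is that `learned_visitation` approaches `expert_visitation`, not that `learned_reward` equals `true_reward`. A deep variant would backpropagate the same visitation difference through a reward network instead of updating a table.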