Training effective deep reinforcement learning agents for real-time life-cycle production optimization

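Reinforcement learning, Markov decision process, Mathematical optimization, Computer science, Time horizon, Optimal control, Production (economics), Bellman equation, Dynamic programming, Task (project management), Artificial intelligence, Markov process, Engineering, Mathematics, Statistics, Macroeconomics, Economics, Systems engineering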
Authors
Kai Zhang,Zhongzheng Wang,Guodong Chen,Liming Zhang,Yongfei Yang,Chuanjin Yao,Jian Wang,Jun Yao
Source
Journal: Journal of Petroleum Science and Engineering [Elsevier BV]
Volume: 208, article 109766; cited by: 143
Identifier
DOI:10.1016/j.petrol.2021.109766
Abstract

Life-cycle production optimization aims to obtain the optimal well control scheme at each time control step to maximize financial profit and hydrocarbon production. However, searching for the optimal policy under a limited number of simulation evaluations is a challenging task. In this paper, a novel production optimization method is presented, which maximizes the net present value (NPV) over the entire life-cycle and achieves real-time well control scheme adjustment. The proposed method models the life-cycle production optimization problem as a finite-horizon Markov decision process (MDP), where the well control scheme can be viewed as a sequence of decisions. Soft actor-critic, known as a state-of-the-art model-free deep reinforcement learning (DRL) algorithm, is subsequently utilized to train DRL agents that can solve the above MDP. The DRL agent strives to maximize long-term NPV rewards as well as the control scheme randomness by training a stochastic policy that maps reservoir states to well control variables and an action-value function that estimates the objective value of the current policy. Since the trained policy is an explicit function structure, the DRL agent can adjust the well control scheme in real-time under different reservoir states. Different from most existing methods that introduce task-specific sensitive parameters or construct complex supplementary structures, the DRL agent learns adaptively by executing goal-directed interactions with an uncertain reservoir environment and making use of accumulated well control experience, which is similar to the actual field well control mode. The key insight here is the DRL method's ability to utilize gradient information (well-control experience) for higher sample efficiency. The simulation results based on two reservoir models indicate that, compared to other optimization methods, the proposed method can attain higher NPV and achieve excellent performance in terms of oil displacement.
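For readers unfamiliar with soft actor-critic, the statement that the agent maximizes "long-term NPV rewards as well as the control scheme randomness" corresponds to the standard maximum-entropy objective of that algorithm. The form below is the generic textbook version written for a finite horizon; the symbols ($T$ control steps, reward $r$, temperature $\alpha$) are assumptions for illustration, since the abstract does not give the paper's own notation, and any discount factor is omitted for brevity.

```latex
J(\pi) \;=\; \mathbb{E}_{\tau \sim \pi}\!\left[\, \sum_{t=0}^{T-1} \Big( r(s_t, a_t) \;+\; \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big) \;=\; -\,\mathbb{E}_{a \sim \pi(\cdot \mid s_t)}\big[\log \pi(a \mid s_t)\big]
```

Here $s_t$ would be the reservoir state, $a_t$ the well control variables at step $t$, and $r(s_t, a_t)$ the NPV contribution of that step; $\alpha$ trades off reward against policy entropy.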
• A novel production optimization framework that incorporates advanced deep reinforcement learning technologies is presented.
• The proposed method models the life-cycle production optimization problem as a finite-horizon Markov decision process.
• The trained policy is an explicit function structure that utilizes powerful gradient information for higher sample efficiency.
• The proposed method achieves excellent performance on one classic control task and two reservoir models.
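The finite-horizon MDP view described above can be sketched as a rollout loop in which each control step returns its discounted cash flow as the reward, so that the episode return equals the life-cycle NPV. This is only a minimal illustration: the NPV formula is a commonly used waterflooding form rather than one taken from the paper, and every name, price, and the `toy_simulate` function (a stand-in for a real reservoir simulator) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class StepRates:
    """Average field rates over one control step (assumed units: bbl/day)."""
    oil_produced: float
    water_produced: float
    water_injected: float


def step_npv(rates: StepRates, t_end_days: float, dt_days: float,
             oil_price: float = 70.0, water_cost: float = 5.0,
             injection_cost: float = 5.0, annual_discount: float = 0.1) -> float:
    """Discounted cash flow of one control step (assumed NPV form, made-up prices)."""
    cash = (oil_price * rates.oil_produced
            - water_cost * rates.water_produced
            - injection_cost * rates.water_injected) * dt_days
    return cash / (1.0 + annual_discount) ** (t_end_days / 365.0)


def toy_simulate(state: float, action: float) -> Tuple[float, StepRates]:
    """Hypothetical simulator stand-in: state is the oil cut, action the injection rate."""
    rates = StepRates(oil_produced=action * state,
                      water_produced=action * (1.0 - state),
                      water_injected=action)
    next_state = state * 0.9 if action > 0 else state  # oil cut declines as water is injected
    return next_state, rates


def rollout(policy: Callable[[float], float],
            simulate: Callable[[float, float], Tuple[float, StepRates]],
            initial_state: float, n_steps: int, dt_days: float) -> float:
    """Run one finite-horizon episode and return the accumulated NPV."""
    state, total = initial_state, 0.0
    for k in range(n_steps):
        action = policy(state)                  # policy maps state to well controls
        state, rates = simulate(state, action)  # environment transition
        total += step_npv(rates, (k + 1) * dt_days, dt_days)  # per-step reward
    return total


if __name__ == "__main__":
    print(rollout(lambda s: 100.0, toy_simulate, 1.0, 12, 30.0))
```

In a DRL setting the fixed `lambda` policy would be replaced by the trained stochastic policy, which is what allows the control scheme to be adjusted in real time as the state changes.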