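Reinforcement learning
Markov decision process
Computer science
Mathematical optimization
Context (archaeology)
Partially observable Markov decision process
Markov chain
Markov process
Engineering design process
Process (computing)
Machine learning
Markov model
Mathematics
Engineering
Mechanical engineering
Operating system
Paleontology
Statistics
Biology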
Authors
Maximilian E. Ororbia, Gordon P. Warn
Abstract
This article presents a framework that mathematically models optimal design synthesis as a Markov Decision Process (MDP) that is solved with reinforcement learning. In this context, the states correspond to specific design configurations, the actions correspond to the available alterations modeled after generative design grammars, and the immediate rewards are constructed to reflect the improvement in the altered configuration's performance with respect to the design objective. Since in optimal design synthesis the immediate rewards are in general not known at the outset of the process, reinforcement learning is employed to efficiently solve the MDP. The goal of the reinforcement learning agent is to maximize the cumulative rewards and hence synthesize the best performing or optimal design. The framework is demonstrated for the optimization of planar trusses with binary cross-sectional areas, and its utility is investigated with four numerical examples, each with a unique combination of domain, constraint, and external force(s), considering both linear-elastic and elastic-plastic material behaviors. The design solutions obtained with the framework are also compared with other methods in order to demonstrate its efficiency and accuracy.
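The MDP formulation described in the abstract can be illustrated with a small toy sketch. The abstract does not name a specific reinforcement learning algorithm or give the reward formula, so everything below is an assumption made for illustration: tabular Q-learning is swapped in as the solver, a design is a tuple of binary member areas, an action toggles one member, and the hypothetical `score` function (distance to an arbitrary `TARGET`) stands in for a real structural analysis. The reward is the improvement in that score, mirroring the abstract's description.

```python
import random
from itertools import product

N_MEMBERS = 4  # hypothetical: four candidate members, each absent (0) or present (1)

# Hypothetical stand-in for a structural analysis; lower is better.
# A real application would evaluate the truss against its design objective.
TARGET = (1, 0, 1, 1)


def score(state):
    """Toy objective: number of members that differ from an arbitrary target."""
    return sum(s != t for s, t in zip(state, TARGET))


def step(state, action):
    """Action i toggles member i; the reward is the improvement in the objective."""
    nxt = list(state)
    nxt[action] = 1 - nxt[action]
    nxt = tuple(nxt)
    return nxt, score(state) - score(nxt)


def train(episodes=5000, horizon=6, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (an assumed solver)."""
    rng = random.Random(seed)
    states = product((0, 1), repeat=N_MEMBERS)
    q = {(s, a): 0.0 for s in states for a in range(N_MEMBERS)}
    for _ in range(episodes):
        state = tuple(rng.randint(0, 1) for _ in range(N_MEMBERS))
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(N_MEMBERS)  # explore
            else:
                action = max(range(N_MEMBERS), key=lambda a: q[(state, a)])  # exploit
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in range(N_MEMBERS))
            # Standard Q-learning update toward reward plus discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def synthesize(q, start, horizon=6):
    """Follow the greedy policy from `start`; return the best configuration visited."""
    state = best = start
    for _ in range(horizon):
        action = max(range(N_MEMBERS), key=lambda a: q[(state, a)])
        state, _ = step(state, action)
        if score(state) < score(best):
            best = state
    return best


if __name__ == "__main__":
    q_table = train()
    print(synthesize(q_table, (0, 0, 0, 0)))
```

The rewards are only discovered by trying alterations, which is the reason the abstract gives for using reinforcement learning rather than solving the MDP directly. In a realistic setting the state space would be far too large for a lookup table, so this sketch conveys only the state, action, and reward structure, not the scale of the trusses considered in the article.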