Keywords
Reinforcement Learning; Transfer Learning; Substructure; Distance Measure (Metric); Computer Science; Artificial Intelligence; Machine Learning
Authors
Peihua Chai, Bilian Chen, Yifeng Zeng, Shenbao Yu
Source
Journal: Neurocomputing (Elsevier BV)
Date: 2024-06-15
Volume 598, Article 128071
Identifier
DOI: 10.1016/j.neucom.2024.128071
Abstract
Transfer reinforcement learning has gained significant traction in recent years as a critical research area, focusing on bolstering agents' decision-making prowess by harnessing insights from analogous tasks. The primary transfer learning method involves identifying appropriate source domains, sharing specific knowledge structures, and subsequently transferring the shared knowledge to novel tasks. However, existing transfer methods exhibit a pronounced dependency on high task similarity and an abundance of source data. Consequently, we formulate a more efficacious approach that optimally exploits previous learning experiences to direct an agent's exploration as it learns new tasks. Specifically, we introduce a novel transfer learning paradigm rooted in a distance measure on the Markov chain, denoted Distance Measure Substructure Transfer Reinforcement Learning (DMS-TRL). The core idea is to partition the Markov chain into the most basic small Markov units, each of which captures the agent's transition between two states, and then to employ a new distance measure to find the most similar unit, which is also the most suitable one for transfer. Finally, we propose a policy transfer method that transfers knowledge through the Q table from the selected Markov unit to the target task. Through a series of experiments on discrete Gridworld scenarios, we compare our approach with state-of-the-art learning methods. The results clearly illustrate that DMS-TRL can adeptly identify the optimal policy in target tasks and converges more swiftly.
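The abstract describes a three-step pipeline: decompose the source Markov chain into two-state "Markov units", rank those units by a distance measure against the target task, and warm-start the target task's Q table from the best match. Below is a minimal Python sketch of that pipeline. Since the abstract does not specify the actual unit definition, distance measure, or transfer rule, the statistics used here (empirical transition probability and mean reward), the L1 distance, and all function names are illustrative stand-ins, not the paper's formulation.

```python
import numpy as np

# Illustrative sketch only: a "Markov unit" is taken to be one
# (state, next_state) pair, summarized by its empirical transition
# probability and mean reward. The paper's actual definitions may differ.

def unit_stats(transitions, n_states):
    """Summarize logged (s, a, r, s') tuples into per-(s, s') statistics."""
    counts = np.zeros((n_states, n_states))
    reward_sum = np.zeros((n_states, n_states))
    for s, _, r, s2 in transitions:
        counts[s, s2] += 1.0
        reward_sum[s, s2] += r
    visits = np.maximum(counts.sum(axis=1, keepdims=True), 1.0)
    prob = counts / visits                       # empirical P(s' | s)
    mean_r = reward_sum / np.maximum(counts, 1.0)
    return prob, mean_r

def nearest_source_unit(src_stats, tgt_stats, tgt_unit):
    """Find the source (s, s') pair closest to a target unit (L1 stand-in)."""
    sp, sr = src_stats
    tp, tr = tgt_stats
    ts, ts2 = tgt_unit
    dist = np.abs(sp - tp[ts, ts2]) + np.abs(sr - tr[ts, ts2])
    return np.unravel_index(np.argmin(dist), dist.shape)

def transfer_q(q_src, q_tgt, state_map):
    """Warm-start target Q rows from matched source states (policy transfer)."""
    for tgt_s, src_s in state_map.items():
        q_tgt[tgt_s] = q_src[src_s].copy()
    return q_tgt

# Toy usage: match one target unit to its nearest source unit, then seed
# the target Q table from the matched source state before Q-learning resumes.
rng = np.random.default_rng(0)
src_log = [(rng.integers(4), 0, rng.random(), rng.integers(4)) for _ in range(200)]
tgt_log = [(rng.integers(4), 0, rng.random(), rng.integers(4)) for _ in range(50)]
src_stats, tgt_stats = unit_stats(src_log, 4), unit_stats(tgt_log, 4)
src_s, _ = nearest_source_unit(src_stats, tgt_stats, tgt_unit=(0, 1))
q_tgt = transfer_q(np.ones((4, 2)), np.zeros((4, 2)), {0: src_s})
```

After the transfer step, the target agent would continue with ordinary Q-learning from the warm-started table; the toy mapping above pairs a single target state with its nearest source state purely for illustration.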