Reinforcement learning
Reinforcement
Computer science
Aeronautics
Artificial intelligence
Psychology
Engineering
Social psychology
Authors
Xiao Wei,Zhaohan Feng,Ziyu Zhou,Jian Sun,Gang Wang,Jie Chen
Identifier
DOI:10.1142/s230138502650007x
Abstract
This paper addresses the problem of guiding a quadrotor through a predefined sequence of waypoints in cluttered environments, aiming to minimize the flight time while avoiding collisions. Previous approaches either suffer from prolonged computational time caused by solving complex non-convex optimization problems or are limited by the inherent smoothness of polynomial trajectory representations, thereby restricting the flexibility of movement. In this work, we present a safe reinforcement learning approach for autonomous drone racing with time-optimal flight in cluttered environments. The reinforcement learning policy, trained using safety and terminal rewards specifically designed to enforce near time-optimal and collision-free flight, outperforms current state-of-the-art algorithms. Additionally, experimental results demonstrate the efficacy of the proposed approach in achieving both minimum flight time and obstacle avoidance objectives in complex environments, with a commendable [Formula: see text] success rate in unseen, challenging settings.