Curiosity Algorithm Enhances Autonomous Navigation

Intelligent Computing

Self-driving cars find their way through unpredictable traffic thanks to path planning technology. Among current AI-driven efforts to make path planning more efficient and reliable, a research team has developed an optimization method proven especially effective in uncertain environments. The results were published June 3 under the title "Action-Curiosity-Based Deep Reinforcement Learning Algorithm for Path Planning in a Nondeterministic Environment" in Intelligent Computing, a Science Partner Journal.

The team evaluated their method in a realistic simulation platform using the TurtleBot3 Waffle robot equipped with 360° LiDAR sensors. They tested four distinct scenarios, ranging from simple static obstacle courses to highly complex environments with dynamic, unpredictably moving obstacles. Compared with several state-of-the-art baseline algorithms, their method showed notable improvements across key metrics, including convergence speed, training duration, path planning success rate, and average reward.

This method is based on deep reinforcement learning, which enables agents to learn optimal behaviors through real-time interaction with a dynamic environment but often suffers from slow convergence and low learning efficiency. To address this challenge, the team designed and integrated an action curiosity module into the framework. This module allows intelligent agents—in their study, robots—to learn more efficiently and obtain rewards while satisfying their curiosity through extensive exploration.

The action curiosity module encourages the agent to focus on states of moderate difficulty, striking a balance between exploring completely novel states and exploiting known rewarding behaviors. The module extends the traditional intrinsic curiosity module by incorporating an obstacle perception prediction network. The prediction network dynamically calculates curiosity rewards based on prediction errors related to obstacles, directing the agent's attention to states that maximize both learning and exploration efficiency.
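To illustrate the general idea, not the team's actual implementation, a curiosity bonus driven by an obstacle prediction error might be sketched as below. The network architecture, observation layout, and scaling factor are assumptions for illustration only; the published module will differ in its details.

```python
import torch
import torch.nn as nn

class ObstaclePredictionNet(nn.Module):
    """Illustrative network: predicts the next obstacle-related observation
    (e.g. a LiDAR scan) from the current observation and the chosen action."""
    def __init__(self, obs_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def forward(self, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, action], dim=-1))


def curiosity_reward(pred_net: ObstaclePredictionNet,
                     obs: torch.Tensor,
                     action: torch.Tensor,
                     next_obs: torch.Tensor,
                     scale: float = 0.5) -> torch.Tensor:
    """Intrinsic reward proportional to the prediction error on the next
    obstacle observation: a larger error marks a more surprising state.
    The scale factor is a hypothetical hyperparameter."""
    with torch.no_grad():
        predicted = pred_net(obs, action)
        error = torch.mean((predicted - next_obs) ** 2, dim=-1)
    return scale * error
```

In this sketch the agent is drawn toward states its prediction network handles poorly; keeping the bonus moderate, rather than rewarding only the most unpredictable states, is what lets such a scheme balance exploration against exploitation.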

To prevent performance degradation that can occur from excessive exploration in later training stages, the team also used a cosine annealing strategy. This strategy gradually adjusts the weight of curiosity rewards, stabilizing the training process and enabling reliable convergence of the learned policy.
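A cosine annealing schedule of this kind can be expressed in a few lines. The following is a minimal sketch, assuming the curiosity weight decays from a maximum to a minimum value over training; the parameter names and the way the weighted bonus is combined with the environment reward are illustrative assumptions, not the paper's exact formulation.

```python
import math

def curiosity_weight(step: int, total_steps: int,
                     w_max: float = 1.0, w_min: float = 0.0) -> float:
    """Cosine-annealed weight for the intrinsic (curiosity) reward:
    starts at w_max and decays smoothly to w_min as training proceeds."""
    progress = min(step / total_steps, 1.0)
    return w_min + 0.5 * (w_max - w_min) * (1.0 + math.cos(math.pi * progress))

# Illustrative combination of rewards at each training step:
# total_reward = extrinsic_reward + curiosity_weight(step, total_steps) * intrinsic_reward
```

Early in training the full curiosity bonus encourages broad exploration; as the weight decays, the agent relies increasingly on the environment reward alone, which helps the learned policy settle rather than keep chasing novelty.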

Looking forward, the research team plans to enhance their method by integrating motion prediction techniques. This next step aims to improve adaptability to highly dynamic and stochastic environments, paving the way for more robust and practical applications in real-world autonomous driving.
