Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning

arXiv:2606.08513v1 (cross-listed). Abstract: Autonomous Underwater Vehicles (AUVs) traditionally rely on complex, heavily engineered pipelines for perception, path planning, and motion control. This paper explores the feasibility of an end-to-end Deep Reinforcement Learning (DRL) approach that maps raw sensor data directly to thruster commands, reducing manual engineering. We propose a hierarchical reinforcement learning (HRL) architecture splitting the problem into two Markov Decision Processes. A High-Level (HL) policy operating at 2 Hz processes raw $84 \times 84$ pixel monocular camera
The increasing sophistication of autonomous systems and the need for more adaptable and less human-dependent remote operations are driving the move towards end-to-end learning in complex environments.
If the approach proves feasible, it would be a step towards autonomous systems that can operate in unpredictable and dangerous environments with minimal human oversight, reducing operational costs and risks.
Traditional modular approaches to AUV control, with separate perception, planning, and control stages, are being challenged by end-to-end DRL, which maps raw sensor data directly to actions and could lead to more robust and adaptable systems.
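As a rough illustration of the two-rate hierarchy the abstract describes, the sketch below runs a high-level policy at 2 Hz on an 84 × 84 image and a low-level policy that emits thruster commands on every control tick. Only the 2 Hz rate, the image size, and the overall sensor-to-thruster mapping come from the abstract. The 20 Hz low-level rate, the four-thruster layout, the (surge, yaw-rate) subgoal, and all function names are assumptions made for illustration, and the hand-written rules merely stand in for learned networks.

```python
HL_RATE_HZ = 2       # from the abstract
IMG_SIZE = 84        # from the abstract
LL_RATE_HZ = 20      # ASSUMPTION: low-level rate is not given in the excerpt
NUM_THRUSTERS = 4    # ASSUMPTION: thruster count is not given in the excerpt


def high_level_policy(image):
    """Stand-in for the learned HL policy (hypothetical).

    Maps an 84x84 grayscale image to a subgoal (surge speed, yaw rate).
    Here it simply steers toward the brighter half of the image.
    """
    half = IMG_SIZE // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    total = left + right
    yaw = 0.0 if total == 0 else (right - left) / total
    return (0.5, yaw)


def low_level_policy(subgoal, velocity):
    """Stand-in for the learned LL policy (hypothetical).

    Maps the current subgoal and measured (surge, yaw rate) to
    NUM_THRUSTERS commands clipped to [-1, 1].
    """
    surge_err = subgoal[0] - velocity[0]
    yaw_err = subgoal[1] - velocity[1]
    raw = [surge_err + yaw_err if i % 2 == 0 else surge_err - yaw_err
           for i in range(NUM_THRUSTERS)]
    return [max(-1.0, min(1.0, u)) for u in raw]


def run_episode(seconds, get_image, get_velocity):
    """Run both policies at their own rates; return (HL call count, commands)."""
    steps_per_hl = LL_RATE_HZ // HL_RATE_HZ  # LL ticks per HL decision
    subgoal = None
    hl_calls = 0
    commands = []
    for step in range(seconds * LL_RATE_HZ):
        if step % steps_per_hl == 0:
            # The HL policy refreshes the subgoal only at its slower rate.
            subgoal = high_level_policy(get_image())
            hl_calls += 1
        # The LL policy acts on every tick, reusing the latest subgoal.
        commands.append(low_level_policy(subgoal, get_velocity()))
    return hl_calls, commands


def bright_right_image():
    """Synthetic test image: dark left half, bright right half."""
    half = IMG_SIZE // 2
    return [[0] * half + [1] * half for _ in range(IMG_SIZE)]


hl_calls, commands = run_episode(3, bright_right_image, lambda: (0.0, 0.0))
print(hl_calls, len(commands), commands[0])
```

The point of the split is that the expensive image-processing policy runs rarely while the cheap control policy runs often; in a real system each stand-in would be replaced by a trained network and the callbacks by simulator or sensor reads.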
- Defence contractors
- Ocean exploration companies
- AI software developers
- Robotics hardware manufacturers
- Traditional AUV control system integrators (if they don't adapt)
- Human-operated underwater inspection services
More resilient and autonomous underwater vehicles with reduced need for human intervention will emerge as a direct consequence.
This will enable new applications in deep-sea exploration, persistent surveillance, and offshore infrastructure maintenance that are currently too risky or expensive.
The success of this end-to-end approach in AUVs could accelerate similar 'raw-data-to-action' paradigms in other complex robotic domains like aerial vehicles or ground robotics, further collapsing traditional engineering layers.
Read at arXiv cs.LG