Perception, approach, and grasping of underwater pipes by an autonomous robot using monocular vision
DOI: https://doi.org/10.17979/ja-cea.2025.46.12227
Keywords: Marine system navigation and control, Autonomous underwater vehicles, Perception and sensing, Motion planning, Image processing
Abstract
This work presents a complete system for the perception, approach, and grasping of a pipe in an underwater environment,
using a robot equipped solely with a monocular camera as its visual sensor. The absence of depth sensors introduces an additional
challenge, as all spatial information must be obtained from 2D images, increasing the complexity of both perception and motion
planning. The detection and segmentation of the pipe are performed using a YOLOv8 model specifically trained for this type of
environment. Based on the segmented image, both the geometric features of the pipe and the grasping points are computed. This
information enables the robot to position itself correctly in front of the pipe and perform the grasp using a simple gripper. The
system was developed in ROS Noetic and validated in three scenarios: the Stonefish simulator, the CIRTESU test tank, and real-world conditions in the Port of Castellón.
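The abstract describes computing the pipe's geometric features from the segmented image; the reference list cites the OpenCV image-moments tutorial for this step. As a minimal sketch of that idea, the snippet below estimates the pipe's image centroid and principal axis from a binary segmentation mask (such as one produced by a YOLOv8 segmentation model) using first- and second-order moments. The function name and interface are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pipe_pose_from_mask(mask):
    """Estimate the 2D centroid and axis orientation of a pipe from a
    binary segmentation mask, using image moments.

    Illustrative sketch: in the paper's pipeline the mask would come
    from a YOLOv8 segmentation model; here it is any 2D 0/1 array.
    Returns ((cx, cy), theta) with theta in radians, measured from the
    image x-axis.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("empty mask: no pipe pixels detected")
    # First-order moments give the centroid of the segmented region.
    cx, cy = xs.mean(), ys.mean()
    # Central second-order moments describe the region's elongation.
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    # Orientation of the principal axis (the pipe's direction in the image).
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return (cx, cy), theta
```

For a horizontal pipe-like blob the returned angle is approximately zero; the centroid and axis together are enough to derive an alignment error for visual servoing toward the grasping point.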
References
Blue Robotics, 2016. BlueROV2: The world's most affordable high-performance ROV. BlueROV2 datasheet, revised May 2025. URL: https://bluerobotics.com/store/rov/bluerov2/bluerov2/
Gonzalez, R. C., Wintz, P., 1987. Digital Image Processing, 2nd Edition. Addison-Wesley, Reading, Massachusetts.
Jain, A. K., 1989. Fundamentals of Digital Image Processing. Prentice Hall, Englewood Cliffs, NJ.
López-Barajas, S., Sanz, P. J., Marín-Prades, R., Echagüe, J., Realpe, S., 2025. Network congestion control algorithm for image transmission—HRI and visual light communications of an autonomous underwater vehicle for intervention. Future Internet 17 (1). URL: https://www.mdpi.com/1999-5903/17/1/10 DOI: 10.3390/fi17010010
Open Source Robotics Foundation, 2020. Robot Operating System (ROS) – Noetic Ninjemys. Released May 23, 2020. URL: https://www.ros.org
OpenCV Team, 2024. Image moments – OpenCV documentation. URL: https://docs.opencv.org/4.x/d0/d49/tutorial_moments.html
Pi, R., Cieślak, P., Ridao, P., Sanz, P. J., 2021. TWINBOT: Autonomous underwater cooperative transportation. IEEE Access 9, 37668–37684. DOI: 10.1109/ACCESS.2021.3063669
Raviv, D., Herman, M., 1993. Visual servoing from 2-d image cues. In: Aloimonos, Y. (Ed.), Active Perception. Lawrence Erlbaum Associates.
Rocco, D. F., 2016. PlotJuggler: Real-time data visualization tool. URL: https://plotjuggler.io/
Torralba, A., Isola, P., Freeman, W., 2024. Foundations of Computer Vision. Adaptive Computation and Machine Learning series. MIT Press. URL: https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/
Ultralytics, 2023. YOLOv8: State-of-the-art real-time object detection and segmentation. URL: https://github.com/ultralytics/ultralytics
License
Copyright (c) 2025 Inés Pérez Edo, Salvador López Barajas, Raúl Marín Prades, Andrea Pino Jarque, Alejandro Solís Jiménez, Pedro José Sanz Valero

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.