Climate variation and human activity are rapidly changing most coastal areas, which calls for efficient and robust environmental monitoring systems. Autonomous underwater and surface robots are a promising means of acquiring high-resolution water property data over large areas, but these systems are sensitive to faulty or missing sensor data, which degrade their accuracy. This paper aims to improve water property estimation in coastal monitoring by using machine learning to compensate for missing and faulty sensor data. The study focuses on Biscayne Bay, Florida, and applies four machine learning models to predict water parameters such as dissolved oxygen, pH, and temperature: linear regression, random forest, support vector regression (SVR), and multilayer perceptron (MLP).
Data were collected with an autonomous underwater vehicle (AUV), the YSI Ecomapper, over several missions carried out from September 2020 to November 2022. The preprocessed data formed a dataset of features, comprising water parameters and geographic information, on which the machine learning models were trained and tested. The predictive performance of each model was compared using mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and R-squared.
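As an illustration of the four comparison metrics named above, the following is a minimal sketch in plain Python. The function name `regression_metrics` and the sample dissolved oxygen values are assumptions made for this example; they are not taken from the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R-squared for paired observations.

    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n                             # mean squared error
    mae = sum(abs(e) for e in errors) / n        # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R-squared is undefined when the observations have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Made-up dissolved oxygen values (mg/L), for illustration only.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.0, 7.5, 9.0]
scores = regression_metrics(observed, predicted)
```

Lower MSE, RMSE, and MAE indicate smaller prediction errors, while an R-squared closer to 1 indicates that the model explains more of the variance in the observed parameter.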
The results show that the random forest ensemble and multilayer perceptron models were effective, comparing favorably with the linear regression and support vector regression models in most instances and attaining high predictive accuracy for the target water parameters. Random forest gave reliable results on all estimated parameters, which is evidence of its capability to capture latent relationships among water quality features. By improving the estimation of water parameters, this paper increases the reliability of sensor readings and enhances the autonomy of marine robots, allowing them to perform in situ monitoring with less human intervention.
The findings have implications for better-informed decision-making in the management of coastal ecosystems, especially regarding persistent monitoring. Precise water quality monitoring is of great importance in coastal environments, which are increasingly threatened by anthropogenic pressures coupled with climate change. Machine learning-based data estimation models of this kind can complement conventional sensor fusion techniques, providing robust solutions for data consistency and filling gaps where sensor data are missing or erroneous. This work contributes to continuous environmental monitoring and promotes advanced machine learning techniques in marine robotics for coastal area sustainability.
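To make the gap-filling idea concrete, the sketch below fits a single-feature least-squares line on the complete readings and uses it to estimate the missing ones. It is a simplified stand-in, not the study's method: the study trains four models on multiple features, whereas this example assumes one predictor (temperature) and one target (dissolved oxygen). The function name `fill_missing` and all values are hypothetical.

```python
def fill_missing(feature, target):
    """Fill None entries in `target` with a least-squares line fitted
    on the complete (feature, target) pairs.

    Hypothetical helper for illustration; not the study's implementation.
    """
    pairs = [(x, y) for x, y in zip(feature, target) if y is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two complete readings to fit a line")

    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    slope = sxy / sxx                      # ordinary least-squares slope
    intercept = mean_y - slope * mean_x    # ordinary least-squares intercept

    # Keep observed readings; estimate only the missing ones.
    return [y if y is not None else slope * x + intercept
            for x, y in zip(feature, target)]


# Made-up readings, for illustration only; None marks a missing sample.
temperature_c = [26.0, 27.0, 28.0, 29.0, 30.0]
dissolved_oxygen = [7.0, 6.8, None, 6.4, 6.2]
filled = fill_missing(temperature_c, dissolved_oxygen)
```

Observed readings are left untouched and only the gaps are estimated, so the output can be compared against the original sensor record. A nonlinear model such as random forest or an MLP could be substituted for the fitted line under the same interface.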
The full paper will be available to logged-in, registered conference attendees once the conference starts on June 22, 2025, and to all visitors after the conference ends on June 25, 2025.