AquaFuse: Waterbody Fusion for Physics Guided View Synthesis of Underwater Scenes

Md Abu Bakr Siddique, Jiayi Wu, Ioannis Rekleitis, and Md Jahidul Islam
IEEE Robotics and Automation Letters (RA-L), March 2025. [Impact Factor: 5.2]

AquaFuse is a physics-guided waterbody fusion method to fuse waterbody properties of a reference image to an input image. It leverages closed-form solutions to estimate and exploit the scene depth, background light, backscatter, and medium illumination parameters for the fusion process. AquaFused scenes are perceptually realistic and meet the underwater image formation constraints. Unlike data-driven style transfer methods, AquaFuse preserves the depth consistency and object geometry in an input scene.
We validate this unique feature by comprehensive experiments over diverse sets of underwater scenes. We demonstrate how AquaFuse can be used for waterbody crossover, image enhancement by fusing with a clear reference scene, and in a generative pipeline for data augmentation. For qualitative and quantitative validation, we collect data at multiple ocean sites over diverse waterbody types and depth levels (10'-80'). Our experiments show that AquaFuse:
• Performs perceptually realistic waterbody fusion.
• Ensures geometric consistency across underwater scenes.
• Is invariant to dataset bias and overfitting, unlike learning-based approaches.
• Offers fast end-to-end computation, being a closed-form solution.


Methodology

An outline of the computational pipeline of AquaFuse is shown on the right; it illustrates how AquaFuse transfers waterbody characteristics from one image to another. As shown in the top section, we start with an input image U1 and estimate parameters such as scene depth, backscatter signal, veiling light, and illumination. These allow us to recover a clean image I1 by compensating for backscatter and light attenuation.
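The restoration step described above can be illustrated with a widely used simplified underwater image formation model, U = I * exp(-beta * d) + B_inf * (1 - exp(-beta * d)). The sketch below is an assumption made for illustration, not the closed-form estimators of AquaFuse: the function names are hypothetical, and the depth map `depth`, attenuation coefficients `beta`, and veiling light `b_inf` are treated as already known rather than estimated from the image.

```python
import numpy as np

def form_underwater(clean, depth, beta, b_inf):
    """Simplified forward model: attenuated direct signal plus backscatter."""
    t = np.exp(-depth[..., None] * beta)           # transmission, shape (H, W, 3)
    return clean * t + b_inf * (1.0 - t)

def restore_clean(observed, depth, beta, b_inf, t_min=1e-3):
    """Invert the model: subtract backscatter, then compensate attenuation."""
    t = np.maximum(np.exp(-depth[..., None] * beta), t_min)
    direct = observed - b_inf * (1.0 - t)          # backscatter removed
    return np.clip(direct / t, 0.0, 1.0)

# Toy data only; none of these values come from the paper.
rng = np.random.default_rng(0)
clean = rng.uniform(0.2, 0.8, size=(4, 4, 3))      # stand-in for I1
depth = rng.uniform(1.0, 5.0, size=(4, 4))         # stand-in range map
beta = np.array([0.40, 0.12, 0.08])                # assumed R, G, B attenuation
b_inf = np.array([0.05, 0.35, 0.45])               # assumed veiling light

observed = form_underwater(clean, depth, beta, b_inf)    # stand-in for U1
recovered = restore_clean(observed, depth, beta, b_inf)
print(np.allclose(recovered, clean))
```

With exact parameters the round trip recovers the toy scene; in practice the quality of the recovered image depends on how well depth, backscatter, and veiling light are estimated.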

As shown in the bottom section, the same process is applied to the reference image U2. For the fusion process, AquaFuse uses the veiling light from the reference image for backscatter fusion, which is then integrated into the input image U1, resulting in the fused output U12. The fusion can also be done in the opposite direction, using U1 as the reference for U2, to generate U21. This process generates realistic underwater scenes by fusing waterbody properties from another (reference) scene while still maintaining depth accuracy and structural consistency.
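A minimal sketch of the fusion idea, under the same kind of simplified image formation model (direct signal plus depth-dependent backscatter): the input scene keeps its own clean image and depth map, while the backscatter term is rebuilt from the reference veiling light. The function names and the choice of attenuation coefficients `beta` are assumptions for illustration and may differ from the actual AquaFuse formulation.

```python
import numpy as np

def transmission(depth, beta):
    """Per-channel transmission for a range map of shape (H, W)."""
    return np.exp(-depth[..., None] * beta)

def fuse_waterbody(clean_in, depth_in, beta, b_inf_ref):
    """Re-render the input scene with backscatter built from the reference veiling light."""
    t = transmission(depth_in, beta)               # geometry comes from the input scene
    backscatter = b_inf_ref * (1.0 - t)            # waterbody comes from the reference
    return np.clip(clean_in * t + backscatter, 0.0, 1.0)

# Toy data only; none of these values come from the paper.
rng = np.random.default_rng(1)
clean1 = rng.uniform(0.2, 0.8, size=(2, 2, 3))     # stand-in for I1
clean2 = rng.uniform(0.2, 0.8, size=(2, 2, 3))     # stand-in for I2
depth1 = rng.uniform(1.0, 4.0, size=(2, 2))
depth2 = rng.uniform(1.0, 4.0, size=(2, 2))
beta = np.array([0.40, 0.12, 0.08])                # assumed attenuation coefficients
b_inf1 = np.array([0.05, 0.40, 0.30])              # assumed greenish veiling light
b_inf2 = np.array([0.05, 0.25, 0.50])              # assumed bluish veiling light

u12 = fuse_waterbody(clean1, depth1, beta, b_inf2)  # scene 1 in waterbody 2
u21 = fuse_waterbody(clean2, depth2, beta, b_inf1)  # scene 2 in waterbody 1
print(u12.shape, u21.shape)
```

Because only the backscatter term changes, the depth map of the input scene is never modified, which is the intuition behind the geometric consistency claimed above.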

Data Augmentation

The proposed waterbody fusion is a unique form of photometric transformation, beyond traditional perspective or isometric functions applied for altering brightness, color, contrast, etc. We validate this by using a few reference images with diverse hues, colors, and contrast variations. A few examples of 6x data augmentation by AquaFuse are shown in the figure below; we select six reference images (U2-U7), which are fused to input samples (Ua-Ud). As shown in each row, input images are fused to six reference images' waterbodies while preserving their original scene contents. The columns are arranged to show waterbody fusion to greenish (U2, U3, U4), bluish (U5, U6), and clear water (U7). A total of 6x4=24 AquaFused images are generated, demonstrating the utility and effectiveness of the method.

In particular, we find that the close-up objects (in Ub, Uc) remain sharp, undistorted, and accurately blended with the surrounding waterbody. The dark shadow regions (in Uc) and bright foreground objects (in Ud) remain intact through the fusion process. Overall, AquaFuse effectively fuses the waterbody into a given image, generating diverse scenarios while preserving the fine object-level details in the scene. One can use a wider variety of reference images for comprehensive data augmentation of underwater image databases; since the object and scene geometry is preserved, standard annotations (for object detection, segmentation, etc.) remain the same, thus facilitating a multi-fold increase in training samples.
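The multi-fold augmentation described above amounts to pairing every annotated input with every reference waterbody and reusing the annotation unchanged. The sketch below shows only that bookkeeping; `augment_dataset` and the placeholder `fuse` callable are hypothetical names, and strings stand in for images and labels.

```python
from itertools import product

def augment_dataset(samples, references, fuse):
    """Fuse every (image, annotation) sample with every reference waterbody.

    The annotation is copied as-is, on the assumption that fusion leaves
    object positions and shapes untouched.
    """
    augmented = []
    for (image, annotation), reference in product(samples, references):
        augmented.append((fuse(image, reference), annotation))
    return augmented

# Placeholders only: strings stand in for image arrays and label files.
samples = [(f"U{k}", f"labels_{k}") for k in "abcd"]    # four annotated inputs
references = [f"U{i}" for i in range(2, 8)]             # six reference scenes

def fuse(image, reference):
    """Placeholder for a real waterbody fusion routine."""
    return f"{image}+{reference}"

augmented = augment_dataset(samples, references, fuse)
print(len(augmented))  # 24
```

A real pipeline would pass image arrays and an actual fusion function; the pairing logic and the reuse of annotations stay the same.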

Image Enhancement and Crossover

A useful feature of AquaFuse is that it can transfer clear water properties as well, which essentially results in clear fused images. As shown in the last column of the figure above, with a clear high-resolution reference image (U7), all input images yield almost dewatered, enhanced images. The transparent nature of the reference image is fused accurately, resulting in crisp and tint-free images. This fusion reduces the impact of the hazy and scattered veil in all input images, which validates our intuition of factorizing veiling light to drive the waterbody fusion process. This feature has significant use cases for applications requiring high-quality water-free images, especially in mapping coral reefs and other marine biological samples. A few other examples are shown on the right, which demonstrate that the AquaFused images are almost dewatered and have significantly better image statistics in terms of color, contrast, and sharpness.

AquaFuse can be used as a style transfer method for blending waterbody appearance between two images. As shown in the figure above, image pairs in the top row are used as reference images for each other to cross over their waterbodies. Although the objects in these scenes are entirely different (e.g., fish, human diver, coral reef, rocks) and at varied depths, foreground renderings of the fused waterbody are perceptually realistic and natural. These transformations simulate diverse underwater conditions, providing more useful data augmentation than data-driven methods. Traditional domain style transfer methods suffer from an averaging effect and often generate distorted, unrealistic images.

Geometry-preserving Scene Transformation

We also performed 3D reconstruction experiments using Gaussian Splatting on AquaFused images. The following figure illustrates AquaFuse's impact on 3D reconstruction quality. The first column shows the original 3D reconstruction results for point clouds, camera point views, and 3D Gaussian splats. The following four columns show the same reconstruction perspectives but with AquaFused scenes using different waterbody styles. These results demonstrate AquaFuse's ability to preserve underwater scenes' structural and geometric integrity in 3D, allowing for realistic and diverse underwater reconstructions. See the detailed video demonstration: here.

The visual inspections corroborate our quantitative analyses of the reconstructed scenes. The following table shows that the PSNR, SSIM, and Learned Perceptual Image Patch Similarity (LPIPS) scores of AquaFused images are within 1-2% of those of the original scenes. AquaFuse slightly improves 3D reconstruction quality, with a higher PSNR and lower LPIPS, at shallow depths (10'-30'). The performance margins are slightly lower for deep-water scenes (over 60'); nevertheless, the scores remain very close to the reconstruction quality obtained with the original images, which validates our intuition and design objectives.
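For readers who want to reproduce this kind of comparison, the sketch below computes PSNR from its standard definition and expresses the gap between two scores as a percentage. The helper names are hypothetical, the arrays are synthetic toy data rather than results from the paper, and SSIM and LPIPS are omitted because they require additional libraries.

```python
import numpy as np

def psnr(reference, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(test, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))

def relative_gap_percent(score_original, score_fused):
    """Absolute difference between two scores, as a percentage of the original."""
    return 100.0 * abs(score_fused - score_original) / abs(score_original)

# Synthetic stand-ins: a held-out view and two noisy "renderings" of it.
rng = np.random.default_rng(0)
held_out = rng.uniform(0.0, 1.0, size=(8, 8, 3))
render_a = np.clip(held_out + rng.normal(0.0, 0.020, held_out.shape), 0.0, 1.0)
render_b = np.clip(held_out + rng.normal(0.0, 0.021, held_out.shape), 0.0, 1.0)

score_a = psnr(held_out, render_a)
score_b = psnr(held_out, render_b)
print(f"PSNR A: {score_a:.2f} dB, PSNR B: {score_b:.2f} dB, "
      f"gap: {relative_gap_percent(score_a, score_b):.2f}%")
```

In an actual evaluation, the two scores would come from reconstructions trained on the original and the AquaFused image sets, each compared against its own held-out views.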

3d results — Image similarity scores (PSNR, SSIM) between AquaFuse-generated images averaged across eight different waterbody fusions and raw input, and estimation error (RMSE) of their respective depth-maps are shown for data collected at different depth intervals.

Conclusion and Future Work

This project introduces AquaFuse, a physics-guided waterbody fusion method for accurate data augmentation, enhancement, and view synthesis of underwater imagery. AquaFuse generates realistic and geometrically consistent underwater images across various depths and waterbody conditions by exploiting the physical characteristics of light scattering, absorption, and backscatter. This makes it a powerful tool for augmenting training databases for underwater image recognition and view synthesis. Another key strength of AquaFuse is that it can handle complex waterbody properties such as varying color, turbidity, and clarity without compromising depth consistency or object geometry. Our experimental results demonstrate that AquaFuse can preserve over 94% depth consistency and 90-95% structural similarity in its waterbody fusion process. Moreover, AquaFuse generates accurate 3D reconstructions using techniques like Gaussian Splatting, making it an ideal tool for improving underwater scene diversity and realism. Our future efforts will focus on deploying AquaFuse in the autonomy pipeline of underwater robots for vision-based mapping and active servoing applications.

Acknowledgments

This work is supported in part by the NSF grants #2330416 and #1943205; and UF research grant #132763.