Reading Group - RoboPI

Fall 2024	Underwater Imaging Papers
20	A computer model for underwater camera systems
P#20: Paper link Highlights: The computational model is designed to calculate underwater camera image components by leveraging inputs such as system geometry, source properties, and water optical properties. It outputs key irradiance components, including non-scattered light, scattered light, and backscatter, along with derived metrics like contrast and signal-to-noise ratio (SNR). This model facilitates comprehensive performance analysis and provides sample calculations to support its application.
19	A revised underwater image formation model
P#19: Paper link Highlights: Current underwater image formation models share a significant limitation: they assume the same coefficients for backscatter and direct transmission, which does not reflect real-world conditions. The key finding is that backscatter is governed by its own wavelength-dependent coefficients, a fact validated with oceanographic data. The revised model accounts for these differences and describes underwater image formation more accurately. This helps account for instabilities in color reconstruction and highlights the need for new methods in underwater image restoration.
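A minimal sketch of the idea, assuming the simplified form in which each color channel has one constant coefficient for the direct signal and a separate one for backscatter: I_c = J_c * exp(-beta_D * z) + B_inf * (1 - exp(-beta_B * z)). The function name, argument names, and any coefficient values passed in are illustrative assumptions, not values from the paper.

```python
import numpy as np

def revised_formation(J, z, beta_D, beta_B, B_inf):
    """Simulate an underwater image with separate direct/backscatter coefficients.

    J      : (H, W, 3) unattenuated scene colors in [0, 1]
    z      : (H, W) camera-to-scene range
    beta_D : length-3 attenuation coefficients for the direct signal (assumed constant)
    beta_B : length-3 backscatter coefficients (assumed constant)
    B_inf  : length-3 veiling light, i.e. backscatter at infinite range
    """
    J = np.asarray(J, dtype=np.float64)
    z = np.asarray(z, dtype=np.float64)[..., None]  # broadcast over channels
    direct = J * np.exp(-np.asarray(beta_D, dtype=np.float64) * z)
    backscatter = np.asarray(B_inf, dtype=np.float64) * (
        1.0 - np.exp(-np.asarray(beta_B, dtype=np.float64) * z)
    )
    return direct + backscatter
```

Setting beta_B equal to beta_D recovers the older atmospheric-style model, which makes it easy to compare the two on the same synthetic scene.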
18	Sea-thru: A method for removing water from underwater images
P#18: Paper link Highlights: Color recovery in underwater images is challenged by the limitations of flawed atmospheric models. To overcome this, a revised model incorporates range-based attenuation and distinct backscatter coefficients, offering a more accurate representation of underwater light behavior. The Sea-thru method further enhances this approach by utilizing RGBD data to estimate backscatter and range-dependent attenuation. With results outperforming traditional atmospheric models on over 1,100 images from two different water bodies, this advancement significantly improves underwater image quality. The impact extends to enabling advanced computer vision and machine learning applications, driving progress in underwater exploration and conservation efforts.
17	3D Gaussian Splatting for Real-Time Radiance Field Rendering
P#17: Paper link Highlights: Radiance field methods struggle with inefficiencies in real-time 1080p rendering at high quality. To address this, a novel approach leverages 3D Gaussians, anisotropic optimization, and fast visibility-aware rendering techniques, enabling real-time performance. This solution achieves state-of-the-art quality while maintaining real-time rendering speeds of 30 frames per second or higher on established datasets, marking a significant advancement in rendering technology.
16	Single image haze removal using dark channel prior
P#16: Paper link Highlights: The proposed method introduces the dark channel prior, a simple yet effective image statistic, for single-image haze removal. The key observation underlying this approach is that most haze-free outdoor image patches contain pixels with low intensity in at least one color channel. By combining this prior with the haze imaging model, the method estimates haze thickness and recovers clear images. The results demonstrate high-quality haze removal while also generating accurate depth maps as a valuable byproduct, showcasing the efficacy of this approach.
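The dark channel described above can be sketched in a few lines of NumPy: take the per-pixel minimum over color channels, then a minimum over a local patch. The transmission estimate below uses the haze imaging model I = J*t + A*(1 - t); the patch size, the `omega` factor, and the assumption that the airlight `A` is already known are illustrative choices for this sketch.

```python
import numpy as np

def dark_channel(image, patch=15):
    """Per-pixel minimum over RGB, followed by a minimum over a patch x patch window."""
    image = np.asarray(image, dtype=np.float64)
    min_rgb = image.min(axis=2)
    pad = patch // 2  # patch is assumed odd so the output keeps the input size
    padded = np.pad(min_rgb, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))

def estimate_transmission(image, airlight, omega=0.95, patch=15):
    """Coarse transmission map: t = 1 - omega * dark_channel(I / A)."""
    normalized = np.asarray(image, dtype=np.float64) / np.asarray(airlight, dtype=np.float64)
    return 1.0 - omega * dark_channel(normalized, patch)
```

A haze-free patch has a dark channel near zero, so the estimated transmission is close to 1 there; a patch that looks like pure airlight yields a transmission near 1 - omega.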
15	SeaThru-NeRF: Neural radiance fields in scattering media
P#15: Paper link Highlights: Traditional NeRF models fail to account for the effects of scattering media in underwater or foggy scenes, limiting their applicability in such environments. To address this, a novel NeRF rendering model is developed using the SeaThru image formation framework. This approach successfully removes medium effects, revealing clear object appearances and accurate depths while achieving photorealistic views, significantly enhancing the realism and utility of NeRF in challenging visual environments.

Summer 2024	Subsea Telerobotics and HMI Papers
14	UUV Simulator: A Gazebo-based Package for Underwater Intervention and Multi-Robot Simulation
P#14: Paper link (Oceans 2016) A widely used open-source marine robotics simulator. Built upon OpenSceneGraph engine, ROS and Gazebo plugin available. An extensive collection of sensors: camera, IMU, magnetometer, DVL, MBES, sonar. Supports multiple ROVs: RexROV2, ECA A9, LAUV, Desistek SAGA ROV.
13	OceanPlan: Hierarchical Planning and Replanning for Natural Language AUV Piloting in Large-scale Unexplored Ocean Environments
P#13: Paper link (ArXiv) Highlights: LLM-based hierarchical motion planner for AUVs. First translates abstract human instructions to robotic tasks, then plans motion. A replanner to sense environmental uncertainties and adjust motion in real time. Tested in HoloEco (built upon HoloOcean) simulator.
12	The PANDORA project: a success story in AUV autonomy
P#12: Paper link (Oceans 2016) Highlights: PANDORA (Persistent Autonomy Through Learning Adaptation Observation and Re-planning): a framework for autonomous underwater inspection and intervention. The AUV learns to inspect and manipulate from operator's demonstration. Adapts the knowledge for valve control in a different scenario. Tested with Girona 500 AUV in laboratory testbed. Follows and inspects a chain using acoustic and visual perception. Tested with Nessie AUV.
11	A Shared Autonomy System for Precise and Efficient Remote Underwater Manipulation
P#11: Paper link (IEEE T-RO, July 2024) Highlights: SHARC (Shared autonomy for remote collaboration): a framework for tele-manipulation with hand gestures and natural language. SHARC-VR: Real-time pose update of the manipulator shown in a 3D reconstructed scene to improve spatial awareness of operators. A shared autonomy framework where the robot autonomously picks up a tool upon operator command.
10	Sensory augmentation for subsea robot teleoperation
P#10: Paper link Highlights: Enhanced telepresence with VR and haptic sensation to improve ROV teleoperation. Hierarchical feedback for (i) near-field (vibration), and (ii) far-field (flow) hydrodynamic sensation. Virtual scene rendering and digital twin modeling from real deployment data.

2023	ViCom - Visual Computing Papers	Presenter
09	Quanta burst photography	Brevin Tilmon
P#9: 04/13/2023 Paper link (ACM ToG) This paper theoretically analyzes the SNR and dynamic range of quanta burst photography, and identifies the imaging regimes where it provides significant benefits. The authors demonstrate, via a recently developed SPAD array, that the proposed method is able to generate high-quality images for scenes with challenging lighting, complex geometries, high dynamic range and moving objects. With the ongoing development of SPAD arrays, quanta burst photography has the potential to find applications in both consumer and scientific photography.
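To make the SPAD measurement model concrete, here is a hedged sketch assuming a simplified single-photon model: each binary frame records a 1 with probability 1 - exp(-phi * tau * eta), where phi is the photon flux, tau the frame exposure, and eta the quantum efficiency. Dark counts and motion are ignored, and the function names are hypothetical. Under this assumption, the flux can be estimated from the number of frames that fired.

```python
import numpy as np

def simulate_binary_frames(flux, n_frames, tau, eta, rng):
    """Draw n_frames binary SPAD frames for a per-pixel flux map (photons/second)."""
    p_detect = 1.0 - np.exp(-np.asarray(flux, dtype=np.float64) * tau * eta)
    return rng.random((n_frames,) + p_detect.shape) < p_detect

def estimate_flux(sum_image, n_frames, tau, eta):
    """Maximum-likelihood flux from the per-pixel count of frames that fired.

    Inverts p = 1 - exp(-phi * tau * eta); pixels that fired in every frame
    are saturated, so the ratio is clipped just below 1 to keep the log finite.
    """
    ratio = np.clip(np.asarray(sum_image, dtype=np.float64) / n_frames, 0.0, 1.0 - 1e-12)
    return -np.log1p(-ratio) / (tau * eta)
```

The logarithm in the estimator is what gives the nonlinear response: the count saturates slowly as flux grows, which is one way to see where the extended dynamic range comes from.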
08	Dual-Shutter Optical Vibration Sensing	Hannah Kirkland
P#8: 04/06/2023 Paper link (CVPR 2022) In this paper, a novel method for sensing vibrations at high speeds (up to 63kHz) is proposed, for multiple scene sources at once, using sensors rated for only 130Hz operation. The method relies on simultaneously capturing the scene with two cameras equipped with rolling and global shutter sensors, respectively. The rolling shutter camera captures distorted speckle images that encode the high-speed object vibrations. The global shutter camera captures undistorted reference images of the speckle pattern, helping to decode the source vibrations. The authors demonstrate their method by capturing vibration caused by audio sources (e.g. speakers, human voice, musical instruments) and analyzing the vibration modes of a tuning fork.
07	Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement	Adnan Abdullah
P#7: 03/30/2023 Paper link (CVPR 2020) The authors propose the first low-light enhancement network that is independent of paired and unpaired training data, thus avoiding the risk of overfitting. As a result, this method generalizes well to various lighting conditions. They design an image-specific curve that is able to approximate pixel-wise and higher-order curves by iteratively applying itself. They also show the potential of training a deep image enhancement model in the absence of reference images through task-specific non-reference loss functions.
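The iterative curve mentioned above can be illustrated with the quadratic form LE(x) = x + alpha * x * (1 - x), applied repeatedly to a normalized image. In the paper the curve parameters are predicted per pixel by a network; in this sketch they are simply passed in as hypothetical constants or arrays, so it only shows the curve itself, not the learning.

```python
import numpy as np

def apply_curves(image, alphas):
    """Apply the enhancement curve once per entry in alphas.

    image  : array with values in [0, 1]
    alphas : iterable of scalars or arrays broadcastable to image, each in [-1, 1]
    """
    x = np.asarray(image, dtype=np.float64)
    for alpha in alphas:
        # Each pass keeps 0 and 1 fixed and stays monotonic for alpha in [-1, 1].
        x = x + alpha * x * (1.0 - x)
    return x
```

Positive alphas brighten mid-tones and negative alphas darken them; stacking several passes produces the higher-order curves the summary refers to.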
06	Toward Fast, Flexible, and Robust Low-Light Image Enhancement	Boxiao Yu
P#6: 03/30/2023 Paper link (CVPR 2022) The authors develop a new Self-Calibrated Illumination (SCI) learning framework for fast, flexible, and robust brightening of images in real-world low-light scenarios. They define an unsupervised training loss that constrains the output of each stage under the effect of the self-calibrated module, giving the model the ability to adapt to diverse scenes. In a nutshell, SCI redefines the peak-point in visual quality, computational efficiency, and performance on downstream tasks.
05	SVIn2: A multi-sensor fusion-based underwater SLAM system	Adnan Abdullah
P#5: 03/09/2023 Paper link (IROS 2019) This paper presents a novel tightly-coupled keyframe-based Simultaneous Localization and Mapping (SLAM) system with loop-closing and relocalization capabilities targeted for the underwater domain. The system is easily adaptable to different sensor configurations: acoustic (mechanical scanning profiling sonar), visual (stereo camera), inertial (linear accelerations and angular velocities), and depth data, which makes it versatile and applicable to different sensor suites and underwater vehicles.
04	Vision Transformer with Deformable Attention	Jackson Arnold
P#4: 02/16/2023 Paper link (CVPR 2022) This paper proposes a novel deformable self-attention module, where the positions of key and value pairs in self-attention are selected in a data-dependent way. This flexible scheme enables the self-attention module to focus on relevant regions and capture more informative features. On this basis, the authors present Deformable Attention Transformer, a general backbone model with deformable attention for both image classification and dense prediction tasks.
03	Holocurtains: Programming Light Curtains via Binary Holography	Brevin Tilmon
P#3: 02/09/2023 Paper link (CVPR 2022) In this work, the authors propose Holocurtains: a light-efficient approach to producing light curtains of arbitrary shape. The key idea is to synchronize a rolling-shutter camera with a 2D holographic projector, which steers (rather than blocks) light to generate bright structured light patterns. Their prototype projector uses a binary digital micromirror device (DMD) to generate the holographic interference patterns at high speeds. This system produces 3D light curtains that cannot be achieved with traditional light curtain setups and thus enables all-new applications, including the ability to simultaneously capture multiple light curtains in a single frame, detect subtle changes in scene geometry, and transform any 3D surface into an optical touch interface.
02	Adiabatic Quantum Computing for Multi Object Tracking	Hannah Kirkland
P#2: 02/02/2023 Paper link (CVPR 2022) In this work, the authors propose the first Multi-Object Tracking (MOT) formulation designed to be solved with AQC. They employ an Ising model that represents the quantum mechanical system implemented on the AQC. They show that their approach is competitive compared with state-of-the-art optimization based approaches, even when using off-the-shelf integer programming solvers. Finally, they demonstrate that this MOT problem is already solvable on the current generation of real quantum computers for small examples, and analyze the properties of the measured solutions.
01	Denoising Diffusion Probabilistic Models	Boxiao Yu
P#1: 01/26/2023 Paper link (NeurIPS 2020) The authors present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. The best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and their models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
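As a rough illustration of the forward (noising) half of a diffusion model, the sketch below samples x_t directly from x_0 using the closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, together with a noise-prediction loss of the kind the weighted bound reduces to. The linear schedule endpoints, the number of steps, and the function names are assumptions for this sketch, and no denoising network is included.

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule and the cumulative products alpha_bar_t."""
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t given x_0 in one step; also return the noise that was added."""
    x0 = np.asarray(x0, dtype=np.float64)
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def noise_prediction_loss(eps, eps_pred):
    """Mean squared error between the true noise and a model's prediction of it."""
    return float(np.mean((np.asarray(eps) - np.asarray(eps_pred)) ** 2))
```

In training, a network would receive x_t and t and output eps_pred; here any array of the right shape can stand in for that output.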