Vicom - Visual Computing Reading Group


No. Spring-2023 Papers Presenter
09 Quanta burst photography Brevin Tilmon
P#9: 04/13/2023 Paper link (ACM ToG)

This paper theoretically analyzes the SNR and dynamic range of quanta burst photography and identifies the imaging regimes where it provides significant benefits. Using a recently developed SPAD array, the authors demonstrate that the proposed method can generate high-quality images of scenes with challenging lighting, complex geometries, high dynamic range, and moving objects. With the ongoing development of SPAD arrays, quanta burst photography has the potential to find applications in both consumer and scientific photography.
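
As a rough illustration of the single-photon imaging model behind this line of work, the sketch below simulates a burst of binary frames for one pixel and recovers the photon flux with the standard maximum-likelihood estimate for one-bit photon detections. The function names, the flux and exposure values, and the unit quantum efficiency are assumptions made for illustration, not taken from the paper's code; dark counts and scene motion are ignored.

```python
import math
import random

def simulate_binary_frames(flux, exposure, n_frames, efficiency=1.0, seed=0):
    """Count the frames in which a pixel detects at least one photon.

    Photon arrivals are modeled as Poisson, so the per-frame detection
    probability is 1 - exp(-flux * exposure * efficiency).
    """
    rng = random.Random(seed)
    p_detect = 1.0 - math.exp(-flux * exposure * efficiency)
    return sum(1 for _ in range(n_frames) if rng.random() < p_detect)

def estimate_flux(detections, n_frames, exposure, efficiency=1.0):
    """Maximum-likelihood flux estimate from a sum of binary frames."""
    if detections >= n_frames:
        raise ValueError("pixel saturated: every frame detected a photon")
    fraction = detections / n_frames
    return -math.log(1.0 - fraction) / (exposure * efficiency)

# Hypothetical numbers, chosen only to exercise the estimator.
true_flux = 2000.0   # photons per second
exposure = 1e-4      # seconds per binary frame
n_frames = 20000
count = simulate_binary_frames(true_flux, exposure, n_frames)
flux_hat = estimate_flux(count, n_frames, exposure)
```

The logarithm compensates for frames in which more than one photon arrived, so the estimate is close to linear at low flux and degrades gradually rather than clipping abruptly at high flux.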


08 Dual-Shutter Optical Vibration Sensing Hannah Kirkland
P#8: 04/06/2023 Paper link (CVPR 2022)

This paper proposes a novel method for sensing high-speed vibrations (up to 63 kHz) from multiple scene sources at once, using sensors rated for only 130 Hz operation. The method relies on simultaneously capturing the scene with two cameras, one equipped with a rolling-shutter sensor and the other with a global-shutter sensor. The rolling-shutter camera captures distorted speckle images that encode the high-speed object vibrations. The global-shutter camera captures undistorted reference images of the speckle pattern, which help decode the source vibrations. The authors demonstrate their method by capturing vibrations caused by audio sources (e.g., speakers, human voice, musical instruments) and by analyzing the vibration modes of a tuning fork.


07 Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement Adnan Abdullah
P#7: 03/30/2023 Paper link (CVPR 2020)

The authors propose the first low-light enhancement network that is independent of paired and unpaired training data, thus avoiding the risk of overfitting. As a result, this method generalizes well to various lighting conditions. They design an image-specific curve that is able to approximate pixel-wise and higher-order curves by iteratively applying itself. They also show the potential of training a deep image enhancement model in the absence of reference images through task-specific non-reference loss functions.
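
To make the iterative curve concrete, here is a minimal sketch of a quadratic enhancement curve of the form LE(x) = x + alpha * x * (1 - x) applied repeatedly to normalized pixel values. In the paper the curve parameters are predicted per pixel by a network; the scalar `alphas` list and the function name `enhance_curve` below are simplifying assumptions for illustration only.

```python
def enhance_curve(pixels, alphas):
    """Apply LE(x) = x + alpha * x * (1 - x) once per entry in alphas.

    pixels: intensities normalized to [0, 1].
    alphas: one curve parameter per iteration, each in [-1, 1]
            (hand-picked here, learned per pixel in the actual method).
    """
    out = list(pixels)
    for alpha in alphas:
        out = [x + alpha * x * (1.0 - x) for x in out]
    return out

dark = [0.0, 0.05, 0.2, 0.5, 1.0]
brightened = enhance_curve(dark, alphas=[0.6] * 4)
```

With each alpha in [-1, 1] the curve keeps values inside [0, 1], leaves 0 and 1 fixed, and stays monotonic, so repeated application yields a higher-order curve without clipping.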


06 Toward Fast, Flexible, and Robust Low-Light Image Enhancement Boxiao Yu
P#6: 03/30/2023 Paper link (CVPR 2022)

The authors develop a new Self-Calibrated Illumination (SCI) learning framework for fast, flexible, and robust brightening of images in real-world low-light scenarios. They define an unsupervised training loss that constrains the output of each stage under the effect of the self-calibrated module, which lets the model adapt to diverse scenes. In a nutshell, SCI sets a new bar for visual quality, computational efficiency, and performance on downstream tasks.


05 SVIn2: A multi-sensor fusion-based underwater SLAM system Adnan Abdullah
P#5: 03/09/2023 Paper link (IROS 2019)

This paper presents a novel tightly-coupled, keyframe-based Simultaneous Localization and Mapping (SLAM) system with loop-closing and relocalization capabilities, targeted at the underwater domain. The system is easily adaptable to different sensor configurations: acoustic (mechanical scanning profiling sonar), visual (stereo camera), inertial (linear accelerations and angular velocities), and depth data. This makes the system versatile and applicable on board different sensor suites and underwater vehicles.


04 Vision Transformer with Deformable Attention Jackson Arnold
P#4: 02/16/2023 Paper link (CVPR 2022)

This paper proposes a novel deformable self-attention module, where the positions of key and value pairs in self-attention are selected in a data-dependent way. This flexible scheme enables the self-attention module to focus on relevant regions and capture more informative features. On this basis, the authors present Deformable Attention Transformer, a general backbone model with deformable attention for both image classification and dense prediction tasks.
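
The data-dependent sampling idea can be sketched as follows: each reference point on the feature map is shifted by an offset, and the feature is read at the shifted, generally fractional, location by bilinear interpolation. In the paper the offsets are produced by a learned network and the sampled features are projected into keys and values; here the offsets are hand-picked and all function names are hypothetical.

```python
import math

def bilinear_sample(feature, y, x):
    """Sample a 2D feature map (list of rows) at a fractional (y, x) location."""
    h, w = len(feature), len(feature[0])
    # Clamp to the valid range so shifted points never leave the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feature[y0][x0] + wx * feature[y0][x1]
    bottom = (1 - wx) * feature[y1][x0] + wx * feature[y1][x1]
    return (1 - wy) * top + wy * bottom

def sample_deformed_points(feature, reference_points, offsets):
    """Read features at reference points shifted by per-point offsets."""
    return [bilinear_sample(feature, ry + dy, rx + dx)
            for (ry, rx), (dy, dx) in zip(reference_points, offsets)]

feature = [[0.0, 1.0, 2.0],
           [3.0, 4.0, 5.0],
           [6.0, 7.0, 8.0]]
reference_points = [(0.0, 0.0), (1.0, 1.0)]
offsets = [(0.5, 0.5), (0.0, 1.0)]   # stand-ins for learned offsets
sampled = sample_deformed_points(feature, reference_points, offsets)
```

Because bilinear interpolation is differentiable with respect to the sampling location, offsets of this kind can be trained end to end along with the rest of the model.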


03 Holocurtains: Programming Light Curtains via Binary Holography Brevin Tilmon
P#3: 02/09/2023 Paper link (CVPR 2022)

In this work, the authors propose Holocurtains: a light-efficient approach to producing light curtains of arbitrary shape. The key idea is to synchronize a rolling-shutter camera with a 2D holographic projector, which steers (rather than blocks) light to generate bright structured light patterns. Their prototype projector uses a binary digital micromirror device (DMD) to generate the holographic interference patterns at high speeds. This system produces 3D light curtains that cannot be achieved with traditional light curtain setups and thus enables all-new applications, including the ability to simultaneously capture multiple light curtains in a single frame, detect subtle changes in scene geometry, and transform any 3D surface into an optical touch interface.


02 Adiabatic Quantum Computing for Multi Object Tracking Hannah Kirkland
P#2: 02/02/2023 Paper link (CVPR 2022)

In this work, the authors propose the first Multi-Object Tracking (MOT) formulation designed to be solved with adiabatic quantum computing (AQC). They employ an Ising model that represents the quantum mechanical system implemented on the AQC. They show that their approach is competitive with state-of-the-art optimization-based approaches, even when using off-the-shelf integer programming solvers. Finally, they demonstrate that this MOT problem is already solvable on the current generation of real quantum computers for small examples, and analyze the properties of the measured solutions.
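
For readers unfamiliar with Ising models, the sketch below evaluates the generic Ising energy E(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j over spins in {-1, +1} and finds the lowest-energy configuration by exhaustive search. The coefficients are made up and do not encode a tracking problem, and brute-force enumeration merely stands in for the annealer; it is only feasible for a handful of spins.

```python
from itertools import product

def ising_energy(spins, h, J):
    """E(s) = sum_i h[i]*s[i] + sum over (i, j) in J of J[(i, j)]*s[i]*s[j]."""
    linear = sum(h[i] * s for i, s in enumerate(spins))
    pairwise = sum(c * spins[i] * spins[j] for (i, j), c in J.items())
    return linear + pairwise

def brute_force_ground_state(h, J):
    """Enumerate all 2**n spin configurations and return the minimizer."""
    best = min(product((-1, 1), repeat=len(h)),
               key=lambda s: ising_energy(s, h, J))
    return best, ising_energy(best, h, J)

# Hypothetical biases and couplings for a three-spin toy problem.
h = [0.5, -1.0, 0.0]
J = {(0, 1): 1.0, (1, 2): -2.0}
state, energy = brute_force_ground_state(h, J)
# state == (-1, 1, 1), energy == -4.5
```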


01 Denoising Diffusion Probabilistic Models Boxiao Yu
P#1: 01/26/2023 Paper link (NeurIPS 2020)

The authors present high-quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. The best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics. Their models also naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
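
As a small illustration of the forward (noising) process used by these models, the sketch below builds a linear variance schedule and draws a noisy sample x_t directly from a clean sample x_0 using the closed form x_t ~ N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I). The schedule endpoints and the 1000 steps follow the values reported in the paper; the function names, the toy three-element "image", and the 0-based step index are assumptions of this sketch, and no network or training loop is included.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_t for each diffusion step."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t given x0 in one shot; also return the noise that was added."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [0.5, -0.25, 1.0]   # toy stand-in for normalized image pixels
x_t, noise = q_sample(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

In the paper's simplified training objective, a network is trained to predict `noise` from `x_t` and `t`, which is where the connection to denoising score matching enters.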