Blurred LiDAR for Sharper 3D

Robust Handheld 3D Scanning with Diffuse LiDAR and RGB

Massachusetts Institute of Technology

Abstract


3D surface reconstruction is essential across applications of virtual reality, robotics, and mobile scanning. However, RGB-based reconstruction often fails in low-texture, low-light, and low-albedo scenes. Handheld LiDARs, now common on mobile devices, aim to address these challenges by capturing depth information from time-of-flight measurements of a coarse grid of projected dots. Yet, these sparse LiDARs struggle to provide adequate scene coverage when input views are limited, leaving large gaps in depth information. In this work, we propose using an alternative class of "blurred" LiDAR that emits a diffuse flash, greatly improving scene coverage but introducing spatial ambiguity from mixed time-of-flight measurements across a wide field of view. To handle these ambiguities, we propose leveraging the complementary strengths of diffuse LiDAR with RGB. We introduce a Gaussian surfel-based rendering framework with a scene-adaptive loss function that dynamically balances RGB and diffuse LiDAR signals. We demonstrate that, surprisingly, diffuse LiDAR can outperform traditional sparse LiDAR, enabling robust 3D scanning with accurate color and geometry estimation in challenging environments.

Method


(a) Sparse (conventional) LiDAR vs. diffuse LiDAR. Sparse LiDAR projects a grid of points, each of which yields a precise timing return corresponding to an individual depth; diffuse LiDAR instead projects a diffuse flash illumination and measures returns over a wide per-pixel instantaneous field-of-view (IFOV), increasing spatial coverage but also ambiguity in the inferred depth. (b) Diffuse LiDAR and RGB have complementary strengths: RGB provides dense spatial and color information, while diffuse LiDAR provides coarse, metric depth even under challenging scenarios.
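To make the contrast concrete, the toy forward model below shows how a sparse-LiDAR dot produces a single sharp histogram peak while a diffuse-LiDAR pixel mixes returns from many surface points within its wide IFOV. This is an illustrative sketch only; the function, bin settings, and sampled depths are assumptions, not the sensor model used in this work.

```python
import numpy as np

def simulate_pixel_transient(scene_depths, weights, bin_edges):
    # scene_depths: depths (m) of surface points seen within one measurement's IFOV
    #               (a single point for a sparse-LiDAR dot, many for a diffuse pixel)
    # weights:      relative return strength of each point (albedo / solid angle)
    # bin_edges:    histogram bins, expressed here in depth rather than time
    hist, _ = np.histogram(scene_depths, bins=bin_edges, weights=weights)
    return hist

bins = np.linspace(0.0, 3.0, 31)

# Sparse dot: one return at 1.2 m -> single unambiguous peak
sparse = simulate_pixel_transient(np.array([1.2]), np.array([1.0]), bins)

# Diffuse pixel: many returns spread across the IFOV -> mixed, ambiguous histogram
diffuse = simulate_pixel_transient(np.random.uniform(0.8, 1.6, 500),
                                   np.ones(500) / 500, bins)
```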

We consider a compact hardware setting with a co-located RGB camera and diffuse LiDAR. At each view, we capture (a) an RGB image and coarse (8 × 8) histograms, in which each pixel contains a mixed signal from a wide IFOV ω. We perform (b) analysis-by-synthesis reconstruction using Gaussian surfels, sampling rays within each pixel's IFOV and rendering transients with alpha-weighted differentiable binning. Loss signals from the RGB and transient inputs are balanced dynamically with a scene-adaptive loss, producing (c) high-fidelity RGB and depth/normals for accurate mesh reconstruction.
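As a rough illustration of the alpha-weighted differentiable binning step, the sketch below softly assigns each sampled ray's depth return to histogram bins with a Gaussian kernel so that gradients can flow back to the Gaussian surfels. The kernel choice, normalization, and variable names are assumptions for illustration rather than the exact formulation used in our pipeline.

```python
import torch

def render_transient(depths, alphas, bin_edges, sigma=0.05):
    # depths:    (R,) depth returns for rays sampled within one pixel's IFOV
    # alphas:    (R,) alpha-composited weights of those returns
    # bin_edges: (B+1,) histogram bin edges, expressed in depth
    # sigma:     softening width; lets gradients flow across bin boundaries
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])                        # (B,)
    # Soft (Gaussian) assignment of each return to every bin: (R, B)
    w = torch.exp(-0.5 * ((depths[:, None] - centers[None, :]) / sigma) ** 2)
    w = w / (w.sum(dim=1, keepdim=True) + 1e-8)                             # normalize per return
    return (alphas[:, None] * w).sum(dim=0)                                 # alpha-weighted soft histogram
```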

Results


We observe improved recoverability with diffuse LiDAR when input views are limited. In our analysis simulation, full rank is 900. Diffuse LiDAR has greater voxel coverage than conventional sparse LiDAR; this greater coverage can improve rank, and thereby recoverability, when the number of input views is limited. This limited-view regime is the focus of this work. As the number of input views increases, sparse LiDAR can eventually provide sufficient coverage for scene recoverability.
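One simple proxy for this kind of recoverability analysis is to build a coverage matrix with one row per measurement, marking the voxels inside its IFOV, and inspect its rank relative to the total voxel count. The sketch below is an assumed illustration with hypothetical names; the exact voxelization behind the 900-voxel full-rank grid may differ in detail.

```python
import numpy as np

def coverage_rank(ifov_voxel_masks):
    # ifov_voxel_masks: (M, V) binary matrix; row m marks which of the V scene
    # voxels fall inside measurement m's IFOV, accumulated over all input views
    # (one row per sparse-LiDAR dot, or per diffuse-LiDAR histogram pixel).
    # A rank near V indicates the scene is well constrained by the measurements;
    # a low rank indicates coverage gaps.
    return np.linalg.matrix_rank(ifov_voxel_masks.astype(float))
```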

We enable (a) accurate RGB novel-view synthesis on fully textured scenes, where our adaptive loss may prioritize RGB. When (b) textured objects sit on textureless planes, we achieve greater object-plane separation and robustness in peripheral regions; on textureless objects, we (c) improve geometry estimation within the object's convex hull visible from RGB. We also (d) outperform sparse LiDAR in scenes without RGB signal, where our adaptive loss may weight diffuse LiDAR more heavily. Finally, we enable (e) accurate colored-mesh estimation across a wide range of texture and object variations.

We simulate low lighting with added Gaussian noise; our scene-adaptive loss weighting can be used to rely on diffuse LiDAR inputs more heavily as RGB input SNR decreases, enabling robust depth estimation across a wide range of low-light noise levels.
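One possible instantiation of this behavior is a gating weight driven by an RGB SNR estimate, as sketched below; the sigmoid gate, threshold, and loss terms are assumptions for illustration and do not reproduce our scene-adaptive loss exactly.

```python
import torch
import torch.nn.functional as F

def adaptive_loss(rgb_pred, rgb_gt, transient_pred, transient_gt,
                  snr_estimate, k=1.0, t=1.0):
    # snr_estimate: tensor holding a (per-image or per-pixel) RGB SNR estimate.
    # As SNR drops in low light, w_rgb -> 0 and the transient (diffuse LiDAR)
    # term dominates; in clean, well-lit scenes, w_rgb -> 1 and RGB dominates.
    w_rgb = torch.sigmoid(k * (snr_estimate - t)).mean()
    l_rgb = F.l1_loss(rgb_pred, rgb_gt)
    l_tof = F.l1_loss(transient_pred, transient_gt)
    return w_rgb * l_rgb + (1.0 - w_rgb) * l_tof
```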

We improve mesh reconstruction in challenging real-world scenes using only a few (90) inputs. RGB with sparse LiDAR fails to separate the object from the plane due to low albedo and poor spatial coverage.