ForestSplats

ForestSplats: Deformable transient field for Gaussian Splatting in the Wild

Authors: Anonymous

Teaser Image
Generated by AI

Overview


Given unconstrained images as input, ForestSplats removes transient elements from 2D scenes and renders quickly while preserving high rendering quality. Our method decomposes the scene into a deformable transient field and a static field, disentangling static and transient elements without a Vision Foundation Model (VFM). This enables high-quality novel view synthesis from unconstrained image collections.

Abstract

Recently, 3D Gaussian Splatting (3DGS) has emerged, showing real-time rendering speeds and high-quality results in static scenes. Although 3DGS is effective in static scenes, its performance degrades significantly in real-world environments due to transient objects, lighting variations, and heavy occlusion. To tackle this, existing methods estimate occluders or transient elements by leveraging pre-trained models or integrating additional transient field pipelines. However, these methods still suffer from two defects: 1) Utilizing semantic features from a Vision Foundation Model (VFM) incurs additional computational cost during optimization. 2) The transient field struggles to define clear boundaries for occluders due to the lack of explicit transient modeling. To address these problems, we propose ForestSplats, a novel approach that leverages a deformable transient field and a hybrid masking strategy to effectively decompose static scenes from transient distractors. We design the transient field to be deformable, capturing per-view transient elements. Furthermore, we introduce a hybrid masking strategy that refines the boundaries of occluders by leveraging photometric error maps and the percentage of outliers. Additionally, we propose uncertainty-aware densification to prevent unnecessary Gaussians from blurring the boundaries of occluders. Through extensive experiments across several benchmark datasets, we demonstrate that ForestSplats not only outperforms state-of-the-art methods in reconstruction quality but is also efficient for distractor-free novel view synthesis.
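
To make the idea of masking from photometric error maps and an outlier percentage more concrete, here is a minimal sketch. It is not the ForestSplats implementation: the function names, the L1 error, and the quantile-based rule are assumptions chosen for illustration only.

```python
import numpy as np

def photometric_error(rendered, target):
    """Per-pixel L1 error averaged over color channels (inputs: HxWx3 floats)."""
    return np.abs(rendered - target).mean(axis=-1)

def transient_mask(error_map, outlier_fraction=0.1):
    """Flag the highest-error pixels as transient.

    Hypothetical rule: the threshold is the (1 - outlier_fraction) quantile of
    the error map, so roughly `outlier_fraction` of the pixels are flagged.
    """
    threshold = np.quantile(error_map, 1.0 - outlier_fraction)
    return error_map > threshold

# Toy example: a flat 8x8 target and a render that contains a 2x2 "occluder".
target = np.full((8, 8, 3), 0.2)
rendered = target.copy()
rendered[2:4, 2:4] = 0.9

error_map = photometric_error(rendered, target)
mask = transient_mask(error_map, outlier_fraction=4 / 64)
print(int(mask.sum()))  # number of pixels flagged as transient
```

A fixed outlier fraction is the simplest possible choice here. A mask built this way is noisy at object boundaries, which is the problem the boundary refinement described above is meant to address.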

Video

We visualize ForestSplats on the Photo Tourism and NeRF On-the-go datasets. Our method produces high-quality 3D renderings and distractor-free novel view synthesis results.


Photo Tourism



NeRF On-the-go


Image

We provide qualitative results on the Photo Tourism and NeRF On-the-go datasets. Our method addresses occluders in the scene using superpixel-aware masking and a multi-stage training strategy.
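
As a rough illustration of what a superpixel-aware mask can look like, the sketch below averages a noisy per-pixel transient score inside each superpixel and thresholds the result, so that every pixel in a superpixel receives the same decision. The function name, the averaging rule, and the 0.5 threshold are assumptions, not details taken from the paper. The superpixel labels are built from a simple grid here; in practice they would come from a superpixel algorithm such as SLIC.

```python
import numpy as np

def superpixel_aware_mask(pixel_scores, labels, threshold=0.5):
    """Snap a per-pixel transient score to superpixel boundaries.

    pixel_scores: HxW floats in [0, 1], a noisy per-pixel transient score.
    labels:       HxW non-negative ints, the superpixel id of every pixel.
    """
    flat_labels = labels.ravel()
    n_segments = int(flat_labels.max()) + 1
    sums = np.bincount(flat_labels, weights=pixel_scores.ravel(), minlength=n_segments)
    counts = np.bincount(flat_labels, minlength=n_segments)
    means = sums / np.maximum(counts, 1)   # mean score per superpixel
    return means[labels] > threshold       # broadcast the decision back to pixels

# Toy example: a 4x4 image split into four 2x2 grid "superpixels".
rows, cols = np.indices((4, 4))
labels = (rows // 2) * 2 + (cols // 2)

scores = np.zeros((4, 4))
scores[0, 0] = scores[0, 1] = scores[1, 0] = 1.0  # 3 of 4 pixels in segment 0
scores[3, 3] = 1.0                                # one stray pixel in segment 3

mask = superpixel_aware_mask(scores, labels)
print(mask.astype(int))
```

In this toy case the mostly flagged top-left superpixel is filled in completely, while the isolated pixel in the bottom-right superpixel is dropped.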




Interaction

We provide novel view synthesis for static and transient fields on the NeRF On-the-go and Photo Tourism datasets.

GT Composition Static Transient

Evaluation

Image 2

Ablation of the novel view synthesis for static and transient fields on the NeRF On-the-go and Photo Tourism datasets.

Image 1

Ablation of the number of transient Gaussians and the multi-stage training scheme. (a) Evaluation of results across different numbers of transient Gaussians. (b) Comparison of training PSNR with and without the multi-stage training scheme.

Image 3

Ablation of the number of superpixels on the Patio-high scene. N denotes the number of superpixels.

Image 3

Ablation of the transient mask. (a) Comparison of transient mask methods. (b) Effect of the superpixel-aware mask.

Image 3

Ablation of the uncertainty-aware densification (UAD). We visualize centroids of static Gaussians with and without UAD.
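
One way to picture uncertainty-aware densification is as an extra gate on the usual gradient-based densification test: a static Gaussian is cloned or split only if its positional gradient is large and its uncertainty is low. The sketch below is an assumption-laden illustration, not the authors' criterion. The function name, the per-Gaussian `uncertainty` values, and both thresholds are hypothetical.

```python
import numpy as np

def select_for_densification(grad_norms, uncertainty,
                             grad_threshold=0.0002,
                             uncertainty_threshold=0.5):
    """Return a boolean array marking Gaussians that may be densified.

    grad_norms:  accumulated view-space positional gradient norm per Gaussian.
    uncertainty: hypothetical per-Gaussian score in [0, 1]; high values are
                 assumed to indicate regions near transient objects.
    """
    needs_detail = grad_norms > grad_threshold
    is_reliable = uncertainty < uncertainty_threshold
    return needs_detail & is_reliable

grad_norms = np.array([0.001, 0.001, 0.00001])
uncertainty = np.array([0.1, 0.9, 0.1])
selected = select_for_densification(grad_norms, uncertainty)
print(selected)
```

Only the first Gaussian is selected here. The second has a large gradient but high uncertainty, so it is skipped instead of spawning new static Gaussians near an occluder boundary.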

Image 3

Qualitative ablation of each component in ForestSplats. More details are provided in the supplementary material.

Image 1

Ablation of computational efficiency. We report CPU memory usage (MB) for different initial numbers of transient Gaussians.

Image 2

Ablation of the transient field with and without deformation on the Brandenburg Gate and Corner scenes.

Image 3

Ablation of the effectiveness of each component in ForestSplats. We report all quantitative results on the Sacre Coeur scene.

Conclusion

We introduce ForestSplats, a novel framework that leverages a deformable transient field and a superpixel-aware mask to effectively remove distractors from static fields. Furthermore, we introduce uncertainty-aware densification, which enhances rendering quality by avoiding the generation of static Gaussians on the boundaries of distractors during optimization. Extensive experiments on several datasets demonstrate that our method shows competitive performance compared to existing methods without a pre-trained model or excessive memory usage.

This website is licensed under a CC BY-SA 4.0 license and is based on the LOKI website.