WildGaussians: 3D Gaussian Splatting in the Wild

Jonas Kulhanek CTU in Prague, ETH Zurich
Songyou Peng ETH Zurich
Zuzana Kukelova CTU in Prague
Marc Pollefeys ETH Zurich
Torsten Sattler CTU in Prague

WildGaussians boosts 3DGS for in-the-wild scenes with appearance and dynamic changes

Abstract

We introduce WildGaussians, a novel approach to handle occlusions and appearance changes with 3DGS. By leveraging robust DINO features and integrating an appearance modeling module within 3DGS, our method achieves state-of-the-art results. We demonstrate that WildGaussians matches the real-time rendering speed of 3DGS while surpassing both 3DGS and NeRF baselines in handling in-the-wild data, all within a simple architectural framework.

WildGaussians overview
Left (appearance modeling): Per-Gaussian and per-image embeddings are passed as input to the appearance MLP which outputs the parameters of an affine transformation applied to the Gaussian's view-dependent color. Right (uncertainty modeling): An uncertainty estimate is obtained by a learned transformation of the GT image's DINO features. To train the uncertainty, we use the DINO cosine similarity (dashed lines).
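
As a rough illustration of the appearance branch described above, the following Python sketch concatenates a per-Gaussian and a per-image embedding, passes them through a small MLP, and applies the resulting affine transformation to a color. The layer sizes, the ReLU activation, the random weights, the `1 + scale` parameterization, the clamping to [0, 1], and all function names are assumptions made for this example; it is not the released implementation.

```python
import random


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Create a two-layer MLP with small random weights (hypothetical sizes, no biases)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def mlp_forward(params, x):
    """Linear layer, ReLU, linear layer."""
    w1, w2 = params
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    return [sum(w * v for w, v in zip(row, hidden)) for row in w2]


def apply_appearance(params, gaussian_emb, image_emb, color):
    """Map (per-Gaussian, per-image) embeddings to an affine color transform.

    The MLP outputs six numbers: a per-channel scale residual and a
    per-channel offset. Writing the scale as 1 + residual is an assumption
    that makes an all-zero output the identity transform.
    """
    out = mlp_forward(params, list(gaussian_emb) + list(image_emb))
    scale = [1.0 + s for s in out[:3]]
    offset = out[3:6]
    # Clamp to the displayable range (also an assumption of this sketch).
    return [min(1.0, max(0.0, s * c + o)) for s, c, o in zip(scale, color, offset)]


# Example: a 4-D Gaussian embedding and a 2-D image embedding (arbitrary sizes).
params = make_mlp(in_dim=6, hidden_dim=8, out_dim=6)
toned = apply_appearance(params, [0.1, 0.2, 0.3, 0.4], [0.5, -0.5], [0.2, 0.5, 0.8])
```

Because only the per-image embedding changes between images, the same Gaussians can be rendered under different appearances by swapping that embedding.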

Appearance modeling

To enable training on images with varying appearance (for example, images captured at different times of day), we extend 3DGS with an appearance modeling module that achieves the same inference speed as 3DGS. In these visualizations, we interpolate between the embeddings of different training images to demonstrate how each method handles appearance changes. Note: we report FPS measured on an NVIDIA 4090 at Full HD resolution (1920x1080).

3DGS (FPS: 7.7) vs. WildGaussians (FPS: 9.6)
K-Planes (FPS: 0.15) vs. WildGaussians (FPS: 6.5)
NeRF-W (reimpl.) (FPS: 0.004) vs. WildGaussians (FPS: 10.6)

Appearance interpolation

We show that our approach is able to smoothly interpolate between different appearances of the same scene.
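
One simple way to produce such a transition is to blend two per-image embeddings linearly and render the scene once per blended embedding. The sketch below shows only the blending step; the function names and the choice of linear interpolation are assumptions for illustration, and the rendering call is left out.

```python
def lerp_embedding(emb_a, emb_b, t):
    """Linearly blend two embeddings; t=0 gives emb_a, t=1 gives emb_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(emb_a, emb_b)]


def interpolation_path(emb_a, emb_b, steps):
    """Return `steps` embeddings evenly spaced from emb_a to emb_b (steps >= 2)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [lerp_embedding(emb_a, emb_b, i / (steps - 1)) for i in range(steps)]


# Example: three embeddings between two hypothetical training-image embeddings.
path = interpolation_path([0.0, 0.0], [1.0, 2.0], steps=3)
```

Each embedding in `path` would then replace the per-image embedding at render time, one frame per step.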

Removing occluders

When a scene contains occluders, Gaussian splatting cannot represent it correctly, which leads to excessive amounts of floaters. Our approach removes these occluders by using a DINO-based uncertainty predictor. Note: we report FPS measured on an NVIDIA 4090 at Full HD resolution (1920x1080).

3DGS (FPS: 18.8) vs. WildGaussians (FPS: 16.0)
NeRF On-the-go (FPS: 0.05) vs. WildGaussians (FPS: 12.6)
3DGS (FPS: 7.5) vs. WildGaussians (FPS: 12.6)
NeRF On-the-go (FPS: 0.05) vs. WildGaussians (FPS: 8.6)
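
The DINO cosine similarity used to train the uncertainty can be illustrated with a minimal sketch. It compares per-patch feature vectors of a ground-truth image and a rendering and flags patches whose features disagree. The feature vectors here are hand-written placeholders rather than real DINO outputs, and the fixed threshold and function names are assumptions; in the method itself the uncertainty is predicted by a learned transformation, not by thresholding.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity of two feature vectors, guarded against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def patch_dissimilarity(gt_feats, rendered_feats):
    """Per-patch dissimilarity: 0 means identical direction, 1 means orthogonal."""
    return [1.0 - cosine_similarity(g, r) for g, r in zip(gt_feats, rendered_feats)]


def occluder_mask(dissimilarity, threshold=0.5):
    """Flag patches whose features disagree strongly (hypothetical threshold)."""
    return [d > threshold for d in dissimilarity]


# Example with two patches: the first matches, the second does not.
gt_feats = [[1.0, 0.0], [0.0, 1.0]]
rendered_feats = [[1.0, 0.0], [1.0, 0.0]]
dissim = patch_dissimilarity(gt_feats, rendered_feats)
mask = occluder_mask(dissim)
```

Patches flagged in `mask` are the ones a training loss would down-weight, so that transient objects do not get baked into the scene as floaters.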

Depth prediction

For reference, we show the depth prediction rendered by rasterizing the Gaussians' centers.

RGB / Depth

Concurrent works

There are several concurrent works that also aim to extend 3DGS to handle in-the-wild data:

Acknowledgements

We would like to thank Weining Ren for his help with the NeRF On-the-go dataset and code, and Tobias Fischer and Xi Wang for fruitful discussions. This work was supported by the Czech Science Foundation (GAČR) EXPRO (grant no. 23-07973X) and by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID:90254). The renderer is built on 3DGS and Mip-Splatting. Please follow the licenses of 3DGS and Mip-Splatting. We thank all the authors for their great work and repos. Finally, we would also like to thank Dor Verbin for the video comparison tool used on this website.

Citation

Please use the following citation:
@article{kulhanek2024wildgaussians,
  title={{W}ild{G}aussians: {3D} Gaussian Splatting in the Wild},
  author={Kulhanek, Jonas and Peng, Songyou and Kukelova, Zuzana and Pollefeys, Marc and Sattler, Torsten},
  journal={arXiv},
  year={2024}
}