Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos
September 12, 2024 · Declared Dead · ACM Transactions on Graphics
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yuheng Jiang, Zhehao Shen, Yu Hong, Chengcheng Guo, Yize Wu, Yingliang Zhang, Jingyi Yu, Lan Xu
arXiv ID
2409.08353
Category
cs.GR: Graphics
Cross-listed
cs.CV
Citations
36
Venue
ACM Transactions on Graphics
Last Checked
3 months ago
Abstract
Volumetric video represents a transformative advancement in visual media, enabling users to freely navigate immersive virtual experiences and narrowing the gap between digital and real worlds. However, the need for extensive manual intervention to stabilize mesh sequences and the generation of excessively large assets in existing workflows impedes broader adoption. In this paper, we present a novel Gaussian-based approach, dubbed DualGS, for real-time and high-fidelity playback of complex human performance with excellent compression ratios. Our key idea in DualGS is to separately represent motion and appearance using the corresponding skin and joint Gaussians. Such an explicit disentanglement can significantly reduce motion redundancy and enhance temporal coherence. We begin by initializing the DualGS and anchoring skin Gaussians to joint Gaussians at the first frame. Subsequently, we employ a coarse-to-fine training strategy for frame-by-frame human performance modeling. It includes a coarse alignment phase for overall motion prediction as well as a fine-grained optimization for robust tracking and high-fidelity rendering. To integrate volumetric video seamlessly into VR environments, we efficiently compress motion using entropy encoding and appearance using codec compression coupled with a persistent codebook. Our approach achieves a compression ratio of up to 120 times, only requiring approximately 350KB of storage per frame. We demonstrate the efficacy of our representation through photo-realistic, free-view experiences on VR headsets, enabling users to immersively watch musicians in performance and feel the rhythm of the notes at the performers' fingertips.
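The two storage figures quoted in the abstract (up to 120 times compression, roughly 350KB per frame) can be turned into rough per-frame and per-second numbers. The sketch below is only back-of-the-envelope arithmetic, not anything from the paper: it assumes the 120x ratio applies to the same frames as the 350KB figure, treats a kilobyte as 1000 bytes, and uses a hypothetical playback rate of 30 frames per second, which the abstract does not state.

```python
# Back-of-the-envelope arithmetic using the two figures quoted in the abstract.
# Assumptions (not from the abstract): 1 MB = 1000 KB, and a 30 fps playback rate.

COMPRESSED_KB_PER_FRAME = 350   # "approximately 350KB of storage per frame"
COMPRESSION_RATIO = 120         # "up to 120 times"
ASSUMED_FPS = 30                # hypothetical frame rate, for illustration only


def implied_uncompressed_mb(compressed_kb: float, ratio: float) -> float:
    """Per-frame size before compression in MB, if the ratio applied exactly."""
    return compressed_kb * ratio / 1000


def stream_rate_mb_per_s(compressed_kb: float, fps: float) -> float:
    """Compressed data rate in MB/s at the given playback frame rate."""
    return compressed_kb * fps / 1000


raw_mb = implied_uncompressed_mb(COMPRESSED_KB_PER_FRAME, COMPRESSION_RATIO)
rate_mb_s = stream_rate_mb_per_s(COMPRESSED_KB_PER_FRAME, ASSUMED_FPS)

print(f"Implied uncompressed frame size: ~{raw_mb:.0f} MB")
print(f"Compressed stream at {ASSUMED_FPS} fps: ~{rate_mb_s:.1f} MB/s")
```

Since the abstract says "up to" 120 times, the implied uncompressed size is an upper-end estimate rather than a reported measurement.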
Similar Papers
In the same crypt: Graphics
Everybody Dance Now · R.I.P. · 👻 Ghosted
Deep Bilateral Learning for Real-Time Image Enhancement · R.I.P. · 👻 Ghosted
Animating Human Athletics · R.I.P. · 👻 Ghosted
BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration · R.I.P. · 👻 Ghosted
Shape Transformation Using Variational Implicit Functions · R.I.P. · 👻 Ghosted
Died the same way: 👻 Ghosted
Language Models are Few-Shot Learners · R.I.P. · 👻 Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library · R.I.P. · 👻 Ghosted
XGBoost: A Scalable Tree Boosting System · R.I.P. · 👻 Ghosted