Physics-Aware 3D Scene Reconstruction with Gaussian Splatting

How we approach it

01 — Representation

Object-decoupled Gaussian Splatting

Vanilla Gaussian Splatting captures how a scene looked, but the resulting splats have no notion of objects — every Gaussian belongs to one anonymous global point cloud. Our work treats object decoupling as a research problem in its own right: splitting Gaussian-Splatted scenes into a static background and a set of individually addressable object splat sets that can be transformed, re-lit, and physically simulated as if they were rigid bodies.

We pair this with a small but deliberate engineering choice — keeping each object's splats in their own group and exposing those groups as first-class entities to the renderer. That makes the rest of the stack possible: per-object physics, per-object fine-tuning, and predictive completion of surfaces no camera ever observed.
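As a rough illustration of what "groups as first-class entities" can mean in practice, the sketch below keeps the background and each object in separate containers and moves one object without touching anything else. The names `SplatGroup` and `DecoupledScene` are hypothetical, only Gaussian centres are stored, and only translation is shown; a full rigid transform would also rotate each Gaussian's orientation. It is not the project's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SplatGroup:
    """Gaussian centres belonging to one entity (hypothetical type)."""
    positions: List[Vec3]

    def translated(self, offset: Vec3) -> "SplatGroup":
        # Rigid translation: every centre in the group moves by the same offset.
        ox, oy, oz = offset
        return SplatGroup([(x + ox, y + oy, z + oz) for x, y, z in self.positions])


@dataclass
class DecoupledScene:
    """Static background plus individually addressable object groups."""
    background: SplatGroup
    objects: Dict[str, SplatGroup] = field(default_factory=dict)

    def move_object(self, name: str, offset: Vec3) -> None:
        # Only the named group changes; background and other objects stay put.
        self.objects[name] = self.objects[name].translated(offset)


scene = DecoupledScene(
    background=SplatGroup([(0.0, 0.0, 0.0)]),
    objects={"mug": SplatGroup([(1.0, 0.0, 0.0), (1.0, 0.1, 0.0)])},
)
scene.move_object("mug", (0.0, 0.5, 0.0))
```

Because each object is addressed by name, a physics engine or a fine-tuning pass can operate on one group at a time.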

Object segmentation

Point cloud labeling

Object only rendered in Gaussian scene

Background after object is removed

02 — Training

Optimization that holds up under decoupling

A Gaussian Splatting scene in which some objects are designated movable rigid bodies cannot be trained with the off-the-shelf optimizer. Object and background splats compete near boundaries; fine-tuning a localized region tends to degrade global PSNR; and the renderer alone provides no signal for distinguishing genuine surface coverage from floaters or holes. The lab's optimization work targets these failure modes with specialized strategies: alternating two-phase training that separates global structural updates from local rigid-body refinement, hull-based selective fine-tuning that restricts local updates to splats inside an object's geometric hull, and explicit object–background separation losses (repulsion and adoption regularization) that recover clean shared boundaries.
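The alternating schedule and the selective fine-tuning mask can be sketched as follows. This is a simplified illustration under stated assumptions: the phase lengths are arbitrary, the function names are hypothetical, and an axis-aligned bounding box stands in for the object's geometric hull. The separation losses are not shown.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def phase_for_step(step: int, global_steps: int, local_steps: int) -> str:
    """Return "global" or "local" for a training step in a repeating cycle."""
    cycle = global_steps + local_steps
    return "global" if step % cycle < global_steps else "local"


def inside_box(p: Vec3, lo: Vec3, hi: Vec3) -> bool:
    # Axis-aligned box test, used here as a simple stand-in for a hull test.
    return all(l <= c <= h for c, l, h in zip(p, lo, hi))


def trainable_mask(positions: Sequence[Vec3], lo: Vec3, hi: Vec3, phase: str) -> List[bool]:
    """Mark which splats may receive updates in the given phase."""
    if phase == "global":
        # Global structural updates touch every splat.
        return [True] * len(positions)
    # Local refinement only touches splats inside the object's region.
    return [inside_box(p, lo, hi) for p in positions]


positions = [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)]
mask = trainable_mask(positions, (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), "local")
```

Restricting local updates to a mask like this is one way to keep a per-object refinement from disturbing the rest of the scene.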

Object removed with clean background boundaries

Removed object with clean boundaries

Object inpainted for smooth interaction in MR

Large object removed with clean background boundaries and inpainted

Removed large object with clean boundaries

03 — Runtime

Mobile XR deployment

Most Gaussian Splatting research stops at desktop visualization. Running decoupled splats in a headset, with rigid-body physics, controller interaction, and headset frame rates on a mobile-class device, is a different engineering problem from rendering on a workstation GPU. Our runtime is built on the open-source aras-p Gaussian Splatting framework for Unity, extended with a native CUDA plugin architecture that exposes per-object GPU transforms to the engine's scene graph.

Physics integration runs through Meta Quest 3 / OpenXR; controller-triggered grab, drop, and throw all flow through Unity's existing rigid-body system. We also explore foveated rendering as a frame-budget lever: concentrating splat-shading effort where the user is actually looking and letting peripheral regions render at lower fidelity. The whole stack is built on actively maintained open-source components (COLMAP, SAM2, LaMa, gsplat, aras-p), keeping the workflow reproducible and letting each contribution slot back into the wider Gaussian Splatting ecosystem.
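The foveation idea can be illustrated by assigning each splat a quality tier from its angular distance to the gaze direction. This is a minimal sketch, not the runtime's implementation: the tier boundaries (10° and 30°) and the function names are assumptions chosen for illustration, and the runtime itself lives in Unity rather than Python.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def eccentricity_deg(gaze: Vec3, to_splat: Vec3) -> float:
    """Angle in degrees between the gaze direction and the eye-to-splat vector.

    Both vectors are assumed to be non-zero.
    """
    dot = sum(g * s for g, s in zip(gaze, to_splat))
    norm = math.sqrt(sum(g * g for g in gaze)) * math.sqrt(sum(s * s for s in to_splat))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def quality_tier(ecc_deg: float, inner_deg: float = 10.0, outer_deg: float = 30.0) -> int:
    """0 = full fidelity, 1 = reduced, 2 = lowest (thresholds are illustrative)."""
    if ecc_deg <= inner_deg:
        return 0
    if ecc_deg <= outer_deg:
        return 1
    return 2


tier_centre = quality_tier(eccentricity_deg((0.0, 0.0, 1.0), (0.0, 0.0, 5.0)))
tier_side = quality_tier(eccentricity_deg((0.0, 0.0, 1.0), (1.0, 0.0, 0.0)))
```

A renderer could map each tier to a cheaper shading path, for example fewer spherical-harmonic bands or a lower internal resolution in the periphery.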

Figure 1 • From a passive splat scene to an interactive one Decoupling · training stability · in-headset interaction

Grabbing object in MR

Object falling under gravity

Object interactable with other Game Objects

Publications & Ongoing Work

The projects below report the specific work that makes up this research line.

2026
Anon.
Under review

RigidGS: Object-Decoupled, Physics-Aware Gaussian Splatting on Meta Quest 3

A Unity-native runtime in which decoupled splat sets become first-class interactive objects, paired with a training methodology designed for the decoupled setting: alternating Phase A / Phase B optimization, hull-based selective fine-tuning, repulsion loss, and adoption regularization, with gsplat depth_var and final_T exposed as training signals for explicit floater and hole detection. Deployed on Quest 3 / OpenXR with per-object rigid-body physics and exploratory foveated rendering. Trained on a 3 × NVIDIA RTX A6000 cluster.
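One way two such signals could be turned into per-pixel flags is sketched below. It treats `depth_var` and `final_T` as plain per-pixel lists rather than calling gsplat. The interpretation (high residual transmittance suggests a hole, high depth variance suggests a floater) and both thresholds are assumptions for illustration, not the paper's detection rule.

```python
from typing import List, Sequence


def flag_pixels(
    depth_var: Sequence[float],
    final_T: Sequence[float],
    var_thresh: float = 0.05,  # illustrative threshold, not from the paper
    T_thresh: float = 0.5,     # illustrative threshold, not from the paper
) -> List[str]:
    """Label each pixel as "hole", "floater", or "ok"."""
    labels = []
    for v, t in zip(depth_var, final_T):
        if t > T_thresh:
            # Much of the ray's transmittance is left over: little splat coverage.
            labels.append("hole")
        elif v > var_thresh:
            # Contributions are spread along the ray instead of forming one surface.
            labels.append("floater")
        else:
            labels.append("ok")
    return labels


labels = flag_pixels(depth_var=[0.0, 0.2, 0.0], final_T=[0.0, 0.1, 0.9])
```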

Paper (TBA) Code (TBA) BibTeX

—
Ongoing

FlipGS: Predicting Bottom Gaussians for Unseen Contact Surfaces

Even after a scene has been decoupled into a static background and movable rigid bodies, one surface is almost never observed: the bottom, where the object meets the table or the floor. FlipGS is an ongoing project that addresses this gap. A PointNet → Transformer network takes the visible Gaussians of an object and predicts the parameters of the bottom Gaussians directly (position, rotation, scale, opacity, and spherical-harmonic colour), avoiding a mesh round-trip. Training uses Objaverse as a large-scale prior. The architecture is up and running. Still in progress are large-scale training, quantitative evaluation, and integration with RigidGS, so that a lifted object presents a believable underside the moment it leaves the table.
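To make the prediction target concrete, the sketch below lays out one Gaussian's parameters as a flat vector and splits a predicted vector back into named fields. It assumes a quaternion rotation (4 values) and RGB spherical harmonics with 3 × (degree + 1)² coefficients, as is common in Gaussian Splatting implementations. The function names are hypothetical and the actual FlipGS output format may differ.

```python
from typing import Dict, List


def gaussian_layout(sh_degree: int) -> Dict[str, slice]:
    """Map each parameter name to its slice in a flat per-Gaussian vector."""
    sizes = [
        ("position", 3),
        ("rotation", 4),                           # quaternion (assumed)
        ("scale", 3),
        ("opacity", 1),
        ("sh_colour", 3 * (sh_degree + 1) ** 2),   # RGB spherical-harmonic coefficients
    ]
    layout: Dict[str, slice] = {}
    start = 0
    for name, size in sizes:
        layout[name] = slice(start, start + size)
        start += size
    return layout


def split_prediction(vec: List[float], sh_degree: int) -> Dict[str, List[float]]:
    """Split one predicted flat vector into named parameter lists."""
    layout = gaussian_layout(sh_degree)
    expected = max(s.stop for s in layout.values())
    if len(vec) != expected:
        raise ValueError(f"expected {expected} values, got {len(vec)}")
    return {name: vec[s] for name, s in layout.items()}


fields = split_prediction([float(i) for i in range(59)], sh_degree=3)
```

With degree-3 harmonics this gives 59 values per Gaussian, which is the size a network head would need to emit for each predicted bottom Gaussian under these assumptions.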

Project page (TBA)