Physics-Aware 3D Scene Reconstruction with Gaussian Splatting

Object Decoupling • Training Methodology • Mobile XR Runtime

Gaussian Splatting yields photorealistic, real-time 3D scene representations, but the resulting splats are unaware of objects and physics: each Gaussian knows where it is and what color it emits, and nothing about the object it belongs to or how that object should behave when the user reaches out to grab it. Move the chair, and the splats stay put. For mixed reality this is the wrong abstraction: presence depends on the world responding to the user, and a frozen scene is, in interactive terms, a flat scene.

Our research pushes Gaussian Splatting toward object-decoupled, physics-aware, interactive scenes for mobile XR. We treat the problem as having three coupled parts: a representation that splits a scene into a static background and individually addressable object splat sets; a training methodology that keeps that decoupled representation stable under fine-tuning; and a runtime that actually puts the result on a head, with rigid-body physics and controller interaction at headset frame rates. Together, these turn vanilla Gaussian Splatting from a passive photoreal viewer into a medium where users can pick up, move, and interact with the world.

The section below describes how we approach the problem; a short list of projects at the bottom of the page points to the specific work — published, in preparation, and ongoing — that makes up this research line.

How we approach it

01 — Representation

Object-decoupled Gaussian Splatting

Vanilla Gaussian Splatting captures how a scene looked, but the resulting splats have no notion of objects — every Gaussian belongs to one anonymous global point cloud. Our work treats object decoupling as a research problem in its own right: splitting Gaussian-Splatted scenes into a static background and a set of individually addressable object splat sets that can be transformed, re-lit, and physically simulated as if they were rigid bodies.

We pair this with a small but deliberate engineering choice — keeping each object's splats in their own group and exposing those groups as first-class entities to the renderer. That makes the rest of the stack possible: per-object physics, per-object fine-tuning, and predictive completion of surfaces no camera ever observed.
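As a rough illustration of what "groups as first-class entities" can mean in practice, the sketch below keeps each object's splats in their own group and applies one rigid transform per group. It is a minimal sketch, not our renderer's data model: the names `SplatGroup`, `moved`, `quat_mul`, and `quat_rotate` are hypothetical, quaternions are assumed to be unit length in (w, x, y, z) order, and only centers and orientations are tracked (scale and opacity are unchanged by a rigid motion).

```python
from dataclasses import dataclass, replace


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def quat_mul(a, b):
    """Hamilton product a * b for quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def quat_rotate(q, v):
    """Rotate 3-vector v by unit quaternion q."""
    w, u = q[0], q[1:]
    t = tuple(2.0 * c for c in cross(u, v))
    ut = cross(u, t)
    return tuple(v[i] + w * t[i] + ut[i] for i in range(3))


@dataclass(frozen=True)
class SplatGroup:
    """Hypothetical container: one object's splats, addressable as a unit."""
    name: str
    positions: tuple      # per-splat centers, each (x, y, z)
    rotations: tuple      # per-splat orientations, each (w, x, y, z)
    movable: bool = True  # False for the static background

    def moved(self, q, t):
        """Return a copy with the rigid transform (rotation q, translation t) applied."""
        if not self.movable:
            raise ValueError(f"group '{self.name}' is static")
        new_pos = tuple(
            tuple(c + d for c, d in zip(quat_rotate(q, p), t))
            for p in self.positions
        )
        # Each splat's own orientation is rotated by the same q.
        new_rot = tuple(quat_mul(q, r) for r in self.rotations)
        return replace(self, positions=new_pos, rotations=new_rot)
```

Because a transform touches exactly one group, moving a chair leaves the background group untouched, which is the property the rest of the stack relies on.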

02 — Training

Optimization that holds up under decoupling

A Gaussian Splatting scene in which some objects are designated movable rigid bodies cannot be trained with the off-the-shelf optimizer. Object and background splats compete near boundaries; fine-tuning a localized region tends to degrade global PSNR; and the renderer alone provides no signal for distinguishing genuine surface coverage from floaters or holes. Our optimization work targets these failure modes with specialized strategies: alternating two-phase training that separates global structural updates from local rigid-body refinement, hull-based selective fine-tuning that restricts local updates to splats inside an object's geometric hull, and explicit object–background separation losses (repulsion and adoption regularization) that recover clean shared boundaries.
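The alternating schedule and the hull restriction can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual training code: the phase lengths are placeholder values, the hull is assumed to be convex and given as outward-facing half-spaces (normal, offset), plain gradient descent stands in for the real optimizer, and the function names are hypothetical. The separation losses are not shown.

```python
def phase_for_step(step, global_steps=200, local_steps=50):
    """Alternate between a 'global' and a 'local' phase (lengths are placeholders)."""
    cycle = global_steps + local_steps
    return "global" if step % cycle < global_steps else "local"


def inside_hull(point, planes, margin=0.0):
    """True if point satisfies n . p <= d + margin for every (n, d) half-space."""
    return all(
        sum(n_i * p_i for n_i, p_i in zip(normal, point)) <= offset + margin
        for normal, offset in planes
    )


def local_update(positions, grads, planes, lr=0.01):
    """Local phase: step only the splats whose centers lie inside the hull."""
    updated = []
    for p, g in zip(positions, grads):
        if inside_hull(p, planes):
            p = tuple(p_i - lr * g_i for p_i, g_i in zip(p, g))
        updated.append(p)  # splats outside the hull are left untouched
    return updated


# Example hull: the unit cube [0, 1]^3 as six outward-facing half-spaces.
UNIT_CUBE = [
    ((1, 0, 0), 1), ((-1, 0, 0), 0),
    ((0, 1, 0), 1), ((0, -1, 0), 0),
    ((0, 0, 1), 1), ((0, 0, -1), 0),
]
```

With `UNIT_CUBE` as the hull, a splat at (0.5, 0.5, 0.5) is updated during the local phase while one at (2, 2, 2) keeps its parameters, so local refinement cannot disturb the rest of the scene.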

03 — Runtime

Mobile XR deployment

Most Gaussian Splatting research stops at desktop visualization. Putting decoupled splats on a head — with rigid-body physics, controller interaction, and headset frame rates on a mobile-class device — is a different engineering problem than rendering on a workstation GPU. Our runtime is built on the open-source aras-p Gaussian Splatting framework for Unity, extended with a native CUDA plugin architecture that exposes per-object GPU transforms to the engine's scene graph.

Physics integration runs through Meta Quest 3 / OpenXR; controller-triggered grab, drop, and throw all flow through Unity's existing rigid-body system. We also explore foveated rendering as a frame-budget lever: concentrating splat-shading effort where the user is actually looking and letting peripheral regions render at lower fidelity. The whole stack is built on actively maintained open-source components (COLMAP, SAM2, LaMa, gsplat, aras-p), keeping the workflow reproducible and letting each contribution slot back into the wider Gaussian Splatting ecosystem.
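The foveation idea can be illustrated with a small sketch that assigns a fidelity level from the angle between the gaze direction and the direction to a splat or splat group. It is only a sketch under assumptions: the thresholds of 10 and 30 degrees are placeholders rather than measured values, the function names are hypothetical, and what each level means in the renderer (for example fewer splats or cheaper shading) is left open. The runtime itself is Unity-based; Python is used here purely for exposition.

```python
import math


def eccentricity_deg(eye_pos, gaze_dir, point):
    """Angle in degrees between the gaze direction and the ray from eye to point."""
    to_point = tuple(p - e for p, e in zip(point, eye_pos))
    norm_g = math.sqrt(sum(g * g for g in gaze_dir))
    norm_p = math.sqrt(sum(c * c for c in to_point))
    if norm_g == 0.0 or norm_p == 0.0:
        return 0.0  # degenerate input: treat as looked-at
    cos_a = sum(g * c for g, c in zip(gaze_dir, to_point)) / (norm_g * norm_p)
    cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_a))


def fidelity_level(ecc_deg, fovea_deg=10.0, mid_deg=30.0):
    """0 = full fidelity, 1 = reduced, 2 = lowest (thresholds are placeholders)."""
    if ecc_deg <= fovea_deg:
        return 0
    if ecc_deg <= mid_deg:
        return 1
    return 2
```

A splat group straight ahead of the gaze gets level 0, while one 45 degrees off-axis falls into the lowest tier.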

Figure 1 • From a passive splat scene to an interactive one: decoupling · training stability · in-headset interaction

Publications & Ongoing Work

The projects below report the specific work that makes up this research line.

2026
Anonymous
Under review
RigidGS: Object-Decoupled, Physics-Aware Gaussian Splatting on Meta Quest 3
A Unity-native runtime in which decoupled splat sets become first-class interactive objects, paired with a training methodology designed for the decoupled setting: alternating Phase A / Phase B optimization, hull-based selective fine-tuning, repulsion loss, and adoption regularization, with gsplat depth_var and final_T exposed as training signals for explicit floater and hole detection. Deployed on Quest 3 / OpenXR with per-object rigid-body physics and exploratory foveated rendering. Trained on a 3 × NVIDIA RTX A6000 cluster.

Ongoing
FlipGS: Predicting Bottom Gaussians for Unseen Contact Surfaces
Even after a scene has been decoupled into a static background and movable rigid bodies, one surface is almost never observed: the bottom, where the object meets the table or the floor. FlipGS is an ongoing project that addresses this gap. A PointNet → Transformer network takes the visible Gaussians of an object and directly predicts the parameters of the bottom Gaussians (position, rotation, scale, opacity, and spherical-harmonic color), avoiding a mesh round-trip. Training uses Objaverse as a large-scale prior. The architecture is implemented and running. Still in progress are large-scale training, quantitative evaluation, and integration with RigidGS, so that a lifted object presents a believable underside the moment it leaves the table.