How we approach it
In-headset behavior as a biometric source
Rather than committing to a single sensor, we study a set of behavioral signals produced by ordinary MR use. Hand and controller kinematics capture individual differences in how a user types, reaches, grips, and pulls — signatures shaped by anthropometry and motor habit. Eye-movement dynamics expose individual oculomotor patterns through saccades, fixations, smooth pursuits, and scan paths. Head-pose trajectories from the headset's own 6-DoF tracking reveal habitual postural micro-patterns during natural visual exploration.
Each modality maps naturally onto a different active-authentication moment: typing-time hand traces for entry-point gating, gaze and head dynamics for continuous in-session verification, and any of them for step-up checks before sensitive actions. The combination, not any single channel, is what makes a deployable XR authentication stack.
Domain-informed models and cross-session evaluation
Behavioral signals in MR are not generic time series. Eye movements follow well-characterized oculomotor physiology; head pose is constrained by neck biomechanics; hand kinematics are shaped by motor habit. Our modeling work builds on that domain structure rather than discarding it: specialized feature-extraction modules built around oculomotor event classes for eye-movement biometrics, task-specific extractors for head and hand streams, and similarity-learning architectures (Siamese networks) for verifying typing-time hand traces.
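To make the similarity-learning idea concrete, here is a minimal sketch of a Siamese-style verification decision. It is not the model described above: `embed` is a hypothetical stand-in for a trained, weight-shared encoder (it only summarizes each channel by mean and standard deviation), and the function names, the cosine score, and the 0.9 threshold are all assumptions chosen for illustration.

```python
import math

def embed(trace):
    """Placeholder for the shared (weight-tied) encoder of a Siamese model.

    A trained network would go here. This stand-in only summarizes each
    channel of a trace by its mean and standard deviation.
    """
    channels = list(zip(*trace))  # trace: list of per-frame tuples
    features = []
    for ch in channels:
        mean = sum(ch) / len(ch)
        var = sum((x - mean) ** 2 for x in ch) / len(ch)
        features.extend([mean, math.sqrt(var)])
    return features

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def verify(enrolled_trace, probe_trace, threshold=0.9):
    """Apply the same encoder to both traces and threshold their similarity."""
    score = cosine_similarity(embed(enrolled_trace), embed(probe_trace))
    return score >= threshold, score

# Toy 3-channel traces (hypothetical values, not real hand-tracking data).
enrolled = [(0.1, 0.2, 0.3), (0.2, 0.1, 0.4), (0.3, 0.3, 0.2)]
accepted, score = verify(enrolled, enrolled)
```

The point of the structure is that both inputs pass through the same encoder, so the decision depends only on how close the two embeddings are, and new users can be enrolled without retraining a per-user classifier.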
We hold evaluation to a higher bar than the field's usual within-session classification, which tends to overstate what a deployed system would achieve. We use cross-session protocols across days and weeks, and longitudinal analyses that extend from several months to about three years, including a 269-day window for controller-based door-opening behavior and a 37-month slice of the GazeBase eye-movement corpus. The recurring methodological question is feature temporal stability: which signatures actually remain individuating as the user's biology and habits drift?
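A cross-session protocol can be sketched as follows: templates come from an enrollment session, probes come from a later session, and the resulting genuine and impostor scores are summarized by the equal error rate (EER), the point where the false-acceptance rate (FAR) and false-rejection rate (FRR) coincide. The sketch assumes one template and one probe per user and a caller-supplied `similarity` function; all names and the toy scores are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Approximate EER: try every observed score as a threshold and keep
    the one where |FAR - FRR| is smallest, reporting their average."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

def cross_session_scores(enroll, probe, similarity):
    """Score enrollment-session templates against later-session probes.

    enroll and probe map user id -> one sample from that session.
    """
    genuine, impostor = [], []
    for claimed, template in enroll.items():
        for actual, sample in probe.items():
            score = similarity(template, sample)
            (genuine if claimed == actual else impostor).append(score)
    return genuine, impostor

# Toy scores for illustration only (not experimental results).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5]
eer, threshold = equal_error_rate(genuine, impostor)  # (0.25, 0.5)
```

Running the same scoring with the probe session taken days, weeks, or months after enrollment gives one EER per time gap, which is the quantity a temporal-stability analysis tracks.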
Architectures and protocols a headset can actually run
Active authentication has to live inside a mobile XR device's compute budget. We favor compact architectures over heavy sequence models: a small feed-forward 1D CNN on head pose, a parameter-efficient design for eye-movement biometrics that matches state-of-the-art accuracy with fewer parameters, and a Siamese-similarity model for hand typing that operates at practical false-acceptance and false-rejection rates (FAR/FRR). None of these requires workstation-class inference.
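A back-of-the-envelope parameter count shows why a small 1D CNN fits comfortably on-device. The layer sizes below are hypothetical and chosen only for illustration (they are not the architecture described above), and the input is assumed to be 6 values per frame (3 position, 3 orientation).

```python
def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel of a 1D convolution."""
    return (in_channels * kernel_size + 1) * out_channels

def dense_params(in_features, out_features):
    """Weights plus one bias per output unit of a fully connected layer."""
    return (in_features + 1) * out_features

# Hypothetical layer sizes, for illustration only.
HEAD_POSE_CHANNELS = 6  # assumed: 3 position + 3 orientation values per frame
layers = [
    ("conv1", conv1d_params(HEAD_POSE_CHANNELS, 16, 5)),
    ("conv2", conv1d_params(16, 32, 5)),
    ("embedding", dense_params(32, 64)),  # applied after global average pooling
]

total_params = sum(count for _, count in layers)
float32_kib = total_params * 4 / 1024  # 4 bytes per float32 weight

for name, count in layers:
    print(f"{name}: {count} parameters")
print(f"total: {total_params} parameters, about {float32_kib:.1f} KiB in float32")
```

With these assumed sizes the model has 5,200 parameters, roughly 20 KiB of float32 weights, which is orders of magnitude below what a large sequence model would need.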
We also study deployment questions that get little attention but matter for shipping. How many enrollment tasks are needed before adding more stops paying off? How does authentication accuracy degrade across months, and is the curve linear or non-linear? Which task subsets generalize best to unseen activities? The output we want is not just a low equal error rate (EER) in a paper, but guidelines that XR vendors could actually use to integrate behavioral biometrics into a real product.
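The enrollment question can be framed as finding a saturation point on an error-versus-enrollment-size curve. The sketch below is one simple way to do that, assuming error has already been measured at several enrollment sizes; the function name, the `min_gain` tolerance, and the numbers are hypothetical and are not experimental results.

```python
def saturation_point(error_by_task_count, min_gain=0.005):
    """Return the smallest measured enrollment size after which moving to
    the next measured size lowers the error by less than min_gain."""
    counts = sorted(error_by_task_count)
    for current, nxt in zip(counts, counts[1:]):
        gain = error_by_task_count[current] - error_by_task_count[nxt]
        if gain < min_gain:
            return current
    return counts[-1]  # error was still improving at the largest size tried

# Made-up error rates keyed by number of enrollment tasks.
toy_curve = {1: 0.20, 2: 0.12, 3: 0.09, 4: 0.087, 5: 0.086}
enough = saturation_point(toy_curve)  # 3
```

The same curve-based framing applies to degradation over time: plotting error against the enrollment-to-probe gap and comparing successive differences is a quick first check on whether the decline looks linear or flattens out.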