Active Authentication for Mixed Reality via Behavioral Biometrics

How we approach it

01 — Modalities

In-headset behavior as a biometric source

Rather than committing to a single sensor, we study a set of behavioral signals produced by ordinary MR use. Hand and controller kinematics capture individual differences in how a user types, reaches, grips, and pulls — signatures shaped by anthropometry and motor habit. Eye-movement dynamics expose individual oculomotor patterns through saccades, fixations, smooth pursuits, and scan paths. Head-pose trajectories from the headset's own 6-DoF tracking reveal habitual postural micro-patterns during natural visual exploration.

Each modality maps naturally onto a different active-authentication moment: typing-time hand traces for entry-point gating, gaze and head dynamics for continuous in-session verification, and any of them for step-up checks before sensitive actions. The combination, not any single channel, is what makes a deployable XR authentication stack.

[Figure: User Typing Instance. User typing in an MR headset.]

[Figure: User Interaction Instance. User performing a door-opening interaction in an MR headset.]

[Figure: User Trajectories. Recorded traces of hand and head movement during user interactions in an MR headset.]

Three behavioral signals captured by ordinary in-headset use. Each carries individual signatures stable enough to support active authentication; together they cover the entry, continuous, and step-up moments of an MR session.

02 — Methodology

Domain-informed models and cross-session evaluation

Behavioral signals in MR are not generic time series. Eye movements obey well-characterized oculomotor physiology, head pose is constrained by neck biomechanics, and hand kinematics are shaped by motor habit. Our modeling work foregrounds that domain structure rather than discarding it. It includes specialized feature-extraction modules built around oculomotor event classes for eye-movement biometrics, task-specific extractors for head and hand streams, and similarity-learning architectures (Siamese networks) for verifying typing-time hand traces.
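One common way to obtain oculomotor event classes from raw gaze samples is a velocity-threshold (I-VT) classifier. The sketch below is illustrative only and is not the feature-extraction pipeline from the papers listed on this page. The 30 deg/s threshold, the uniform sampling assumption, and the function names `classify_ivt` and `group_events` are all assumptions made for the example. It separates fixations from saccades and does not attempt smooth-pursuit detection.

```python
import math


def classify_ivt(gaze_deg, sample_rate_hz, saccade_threshold_deg_s=30.0):
    """Label each inter-sample interval as 'fixation' or 'saccade'.

    gaze_deg: list of (x, y) gaze angles in degrees, uniformly sampled.
    The threshold is an illustrative assumption, not a published setting.
    """
    dt = 1.0 / sample_rate_hz
    labels = []
    for (x0, y0), (x1, y1) in zip(gaze_deg, gaze_deg[1:]):
        # Angular speed approximated by Euclidean distance in angle space.
        speed = math.hypot(x1 - x0, y1 - y0) / dt
        labels.append("saccade" if speed > saccade_threshold_deg_s else "fixation")
    return labels


def group_events(labels):
    """Collapse consecutive identical labels into (label, start_index, length) runs."""
    events = []
    for i, label in enumerate(labels):
        if events and events[-1][0] == label:
            prev_label, start, length = events[-1]
            events[-1] = (prev_label, start, length + 1)
        else:
            events.append((label, i, 1))
    return events


# Synthetic 100 Hz trace: a short fixation, a rapid 4-degree shift, a new fixation.
trace = [(0.0, 0.0), (0.01, 0.0), (0.02, 0.0), (2.0, 0.0), (4.0, 0.0), (4.01, 0.0)]
events = group_events(classify_ivt(trace, sample_rate_hz=100))
```

Per-event statistics (for example fixation duration or saccade amplitude) computed over such runs are the kind of domain-structured features a downstream model could consume in place of the raw time series.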

Evaluation is held to a higher bar than the field's usual within-session classification, which tends to overstate what a deployed system would achieve. We use cross-session protocols across days and weeks, and longitudinal analyses reaching into the multi-month range — including a 269-day window for controller-based door-opening behavior and a 37-month slice of the GazeBase eye-movement corpus. The recurring methodological question is feature temporal stability: which signatures actually remain individuating as the user's biology and habits drift?
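A cross-session protocol of this kind can be sketched as follows: enroll on each user's earliest session, test only on later sessions, and keep the gap in days so that results can be reported per temporal gap. The record layout, the function names `cross_session_pairs` and `gap_bin`, and the bin edges are hypothetical; this is not the evaluation code from the papers.

```python
from collections import defaultdict
from datetime import date


def cross_session_pairs(sessions):
    """Build (user_id, enroll_session, test_session, gap_days) tuples.

    sessions: list of (user_id, session_date, session_id) records.
    Each user is enrolled on their earliest session and tested on every
    later one, so training and test data never share a session.
    """
    by_user = defaultdict(list)
    for user_id, session_date, session_id in sessions:
        by_user[user_id].append((session_date, session_id))

    pairs = []
    for user_id, items in by_user.items():
        items.sort()  # chronological order
        enroll_date, enroll_id = items[0]
        for test_date, test_id in items[1:]:
            gap_days = (test_date - enroll_date).days
            pairs.append((user_id, enroll_id, test_id, gap_days))
    return pairs


def gap_bin(gap_days, edges=(7, 30, 90)):
    """Map a gap in days to a coarse reporting bin (edges are assumptions)."""
    for edge in edges:
        if gap_days <= edge:
            return f"<= {edge} days"
    return f"> {edges[-1]} days"


# Made-up session log for illustration.
sessions = [
    ("u1", date(2024, 1, 1), "s1"),
    ("u1", date(2024, 1, 8), "s2"),
    ("u1", date(2024, 9, 26), "s3"),
    ("u2", date(2024, 2, 1), "s4"),  # single session: no cross-session pair
]
pairs = cross_session_pairs(sessions)
```

Users with only one session contribute no pairs, which is the point: a within-session split would have scored them anyway and inflated the result.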

[Figure: Example of Domain-Informed Modeled Behavior. Example of an architecture with fewer parameters.]

03 — Deployment

Architectures and protocols a headset can actually run

Active authentication has to live inside a mobile XR device's compute budget. We favor compact architectures over heavy sequence models: a small feed-forward 1D CNN on head pose, a parameter-efficient design for eye-movement biometrics that matches state-of-the-art accuracy with fewer parameters, and a Siamese-similarity model for hand typing that operates at practical false acceptance rate (FAR) and false rejection rate (FRR) operating points. None of these requires workstation-class inference.
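To make the compute-budget argument concrete, the parameter count of a small 1D CNN can be worked out by hand. The layer sizes below are invented for illustration and are not the architecture from the head-movement paper; only the counting formulas are standard. A 1D convolution with bias has (kernel_size × in_channels + 1) × out_channels weights, and a linear layer has (in_features + 1) × out_features. The sketch also assumes global average pooling before the final embedding layer.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable weights in a 1D convolution layer."""
    return (kernel_size * in_channels + (1 if bias else 0)) * out_channels


def linear_params(in_features, out_features, bias=True):
    """Trainable weights in a fully connected layer."""
    return (in_features + (1 if bias else 0)) * out_features


def count_params(conv_layers, embedding_dim, input_channels=6):
    """Total weights for a conv stack followed by one linear embedding layer.

    conv_layers: list of (out_channels, kernel_size) tuples.
    input_channels defaults to 6 for a 6-DoF head-pose stream.
    Assumes global average pooling, so the linear layer sees one value
    per channel regardless of window length.
    """
    total = 0
    channels = input_channels
    for out_channels, kernel_size in conv_layers:
        total += conv1d_params(channels, out_channels, kernel_size)
        channels = out_channels
    total += linear_params(channels, embedding_dim)
    return total


# Hypothetical layer sizes, chosen only for illustration.
total = count_params(conv_layers=[(16, 5), (32, 5)], embedding_dim=64)
print(total, "parameters,", total * 4, "bytes as float32")
```

For these made-up sizes the script prints `5200 parameters, 20800 bytes as float32`, which gives a sense of scale for a model of this shape.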

We also study practical deployment questions that get little attention but matter for shipping. How many enrollment tasks are needed before adding more stops paying off? How does authentication accuracy degrade across months, and is the curve linear or non-linear? Which task subsets generalize best to unseen activities? The output we want is not just a low equal error rate (EER) in a paper, but guidelines that XR vendors could actually use to integrate behavioral biometrics into a real product.
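For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is a minimal threshold sweep over similarity scores, assuming that higher scores mean "more likely the genuine user". The function names and the toy scores are assumptions for illustration, and published results typically interpolate along the ROC curve rather than picking the nearest threshold.

```python
def far_frr(genuine, impostor, threshold):
    """Error rates when a score >= threshold is accepted.

    FAR: fraction of impostor scores wrongly accepted.
    FRR: fraction of genuine scores wrongly rejected.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) at the candidate threshold where FAR and FRR
    are closest; the EER is reported as their average at that point."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Toy similarity scores, invented for illustration.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
```

Recomputing this value separately for each enrollment-to-test gap is one simple way to trace how accuracy degrades over months.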

[Figure: Example of a Less-Parameterized Deep Learning Architecture. Example of an architecture with fewer parameters.]

[Figure: ROC Curves of Domain-Informed Eye Movement Biometrics. Example ROC curves.]

[Figure: EER Plots of Head Movement Biometrics with Generic Tasks. Example EER plots.]

Publications

The papers below report the individual studies that make up this research line. Each links to the full text where available.

2025
IEEE WIFS

DIEMB: Domain-Informed Eye Movement Biometrics

Specialized feature-extraction modules built around oculomotor event classes — saccades, fixations, smooth pursuits, and scan paths. On a 37-month longitudinal slice of GazeBase, DIEMB matches the state-of-the-art EKYT at the user level (15.2% vs. 16.2% average EER) using 6% fewer parameters, and identifies 2–3 structured tasks as sufficient to match full-task enrollment performance.

Paper | Code (TBA)

2025
ACM VRST

Head Movement Biometrics for Continuous Authentication in Virtual Reality

A compact 1D CNN over windowed 6-DoF head-pose streams with task-specific extractors, designed to fit a headset's compute budget. The model reaches a best EER of 2.9%, competitive with much heavier sequence baselines, supporting head movement as a viable continuous-authentication signal alongside gaze and gesture.

Paper | Code (TBA)

2024
IEEE FG

HM-Auth: Hand-Movement Biometrics for VR Text Entry

A Siamese-network similarity model over hand-controller trajectories produced while a user types a predefined phrase on a VR keyboard. Across 30 participants, a symmetric-rejection method achieves a False Acceptance Rate of 0.08 at a False Rejection Rate of 0, positioning HM-Auth as a practical entry-point or step-up authentication option for immersive VR.

Paper | Code (TBA)

2026
Anon
Under review

VR-Gate: Door-Opening Behavior as a Longitudinal VR Biometric

A 45-participant longitudinal study with sessions spanning up to 269 days, capturing head pose and hand kinematics during door-opening tasks. Per-user classifiers are evaluated under increasingly large temporal gaps; the door-opening signature remains discriminative across substantial inter-session drift, supporting its use as a low-friction continuous VR biometric.

Paper (TBA) | Code (TBA) | BibTeX (TBA)