NVIDIA "Supersizes" Humanoid Control: SONIC Open-Sources Whole-Body Tracking at Scale

Yesterday, NVIDIA released the DreamDojo world model; today it has continued its "Physical AI" push by open-sourcing SONIC (Supersizing Motion Tracking for Natural Humanoid Whole-Body Control). While DreamDojo focuses on generative simulation, SONIC serves as the foundational "brain" for movement, aiming to close the historical lag in humanoid control by establishing motion tracking as a scalable foundation task for robotics.
Scaling Beyond Manual Rewards
The fundamental bottleneck in humanoid robotics has long been "task selection": traditional reinforcement learning requires researchers to manually engineer complex reward terms for every individual behavior, one set for walking, another for dancing, a third for getting up.
NVIDIA’s researchers argue that motion tracking is a more natural scaling target because it leverages decades of existing human motion-capture (mocap) research. By supersizing this approach, the team trained a generalist controller using:
- 100 million frames of diverse motion data (over 700 hours).
- 42 million parameters, a significant increase over the few million common in current controllers.
- 9,000 GPU hours of compute to achieve universal tracking capabilities.
This massive dataset allows the model to acquire "human motion priors" without the need for manual reward tuning for every new skill.
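By way of contrast, here is a minimal sketch of what a generic motion-tracking reward can look like. The function name, weights, and error scales are illustrative assumptions, not SONIC's published formulation; the point is that a single exponentiated-error term scores any reference motion, so the same objective covers walking, dancing, and getting up alike:

```python
import numpy as np

def tracking_reward(q_robot: np.ndarray, q_ref: np.ndarray,
                    v_robot: np.ndarray, v_ref: np.ndarray,
                    w_pos: float = 0.6, w_vel: float = 0.4) -> float:
    """Generic motion-tracking reward: exponentiated errors between the
    robot's joint state and a reference mocap frame. One reward term
    covers every behavior in the dataset, so no per-skill engineering
    is needed. Weights and scales here are illustrative placeholders."""
    pos_err = np.sum((q_robot - q_ref) ** 2)  # joint-position error
    vel_err = np.sum((v_robot - v_ref) ** 2)  # joint-velocity error
    return w_pos * np.exp(-2.0 * pos_err) + w_vel * np.exp(-0.1 * vel_err)
```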
The Universal Token Space
A core innovation of SONIC is its universal token space. Unlike previous systems that require specialized retargeting for different inputs (such as Amazon's OmniRetarget engine), SONIC uses a unified encoder-decoder architecture that handles heterogeneous input modalities.
This allows a single policy to handle:
- VR Teleoperation: Supporting full-body control via PICO headsets and trackers.
- Video-to-Motion: Estimating human motion from monocular webcams at over 60 FPS.
- Multi-Modal Commands: Executing zero-shot instructions from text prompts (e.g., "dance like a monkey") or rhythmic music audio.
By mapping these diverse sources into a shared latent representation, SONIC enables "cross-embodiment" transfer, allowing a robot like the Unitree G1 to imitate human movements despite morphological differences.
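A minimal PyTorch sketch of what such a unified architecture could look like: per-modality encoders project heterogeneous inputs into one shared token space, and a single backbone and action head map those tokens to joint targets. All layer sizes, input dimensions, and the joint count are illustrative assumptions, not SONIC's actual design:

```python
import torch
import torch.nn as nn

class UniversalTokenPolicy(nn.Module):
    """Sketch of a unified encoder-decoder controller: each modality has
    its own small encoder, but all of them emit tokens in one shared
    latent space consumed by a single backbone and action head."""

    def __init__(self, token_dim: int = 256, n_joints: int = 29):
        super().__init__()
        # Per-modality encoders; input sizes are placeholders, not SONIC's.
        self.encoders = nn.ModuleDict({
            "mocap": nn.Linear(87, token_dim),   # retargeted mocap frame
            "vr":    nn.Linear(21, token_dim),   # headset + tracker poses
            "video": nn.Linear(69, token_dim),   # webcam pose estimate
        })
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=token_dim, nhead=8,
                                       batch_first=True),
            num_layers=4,
        )
        self.action_head = nn.Linear(token_dim, n_joints)  # joint targets

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        # Route through the modality's encoder, then shared layers.
        tokens = self.backbone(self.encoders[modality](x).unsqueeze(1))
        return self.action_head(tokens.squeeze(1))

policy = UniversalTokenPolicy()
vr_input = torch.randn(1, 21)           # a batch of VR tracker features
joint_targets = policy(vr_input, "vr")  # same weights serve every modality
```

Because every modality lands in the same latent space, the downstream policy never needs to know whether a command came from mocap, a headset, or a webcam.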
"System 1" for Generalist Robots
The release positions SONIC as a robust "System 1" controller—the fast, reactive layer of the robotic brain responsible for whole-body skills. This distinguishes it from "System 2" reasoning models like Figure’s Helix 02, which handle slower, high-level planning.
To bridge the gap between reactive control and high-level intent, NVIDIA developed a real-time kinematic motion planner. This planner can regenerate future motions in under 5 ms on a standard laptop, allowing users to guide the robot through navigation, boxing, or postural modes like crawling and kneeling without retraining the underlying policy.
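As a rough illustration of this receding-horizon pattern, here is a hypothetical control loop in Python; the `planner`, `policy`, and `robot` interfaces are invented for the sketch and are not NVIDIA's API:

```python
def control_loop(planner, policy, robot, horizon_s: float = 2.0) -> None:
    """Receding-horizon sketch: a fast kinematic planner keeps regenerating
    a short window of reference motion, and the fixed tracking policy
    follows it. Changing the goal changes the plan, never the policy
    weights. All interfaces here are hypothetical."""
    while robot.is_running():
        goal = robot.current_command()  # waypoint, boxing, "crawl", ...
        # NVIDIA reports replanning completes in under 5 ms on a laptop.
        reference = planner.replan(robot.state(), goal, horizon_s)
        for frame in reference:         # track frames until the next replan
            robot.apply(policy.act(robot.state(), frame))
```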
NVIDIA also demonstrated that SONIC is compatible with foundation-model planning. The team fine-tuned a GR00T N1.5 vision-language-action (VLA) model to emit teleoperation-format commands, which SONIC then executed to achieve a 95% success rate on a mobile pick-and-place task.
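Conceptually, the handoff is a thin piece of glue: the VLA produces commands in the teleoperation format SONIC already understands. The `vla.predict` and `sonic_policy.act` calls below are hypothetical names used only to illustrate the flow:

```python
def vla_step(vla, sonic_policy, robot, instruction: str) -> None:
    """System-2 to System-1 handoff sketch: the fine-tuned VLA emits
    commands in the same format as VR teleoperation, so the tracking
    policy executes them unchanged. All interfaces are hypothetical."""
    teleop_cmd = vla.predict(image=robot.camera_image(), text=instruction)
    robot.apply(sonic_policy.act(robot.state(), teleop_cmd))
```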
Open-Sourcing the "Physical AI" Stack
In a move mirroring the DreamDojo release, NVIDIA has made the SONIC weights, inference code, and documentation public. Lead researcher Zhengyi "Zen" Luo confirmed that the release will be updated continuously, with training code and deeper GR00T integration expected soon.
The research was validated on the Unitree G1, achieving a 100% success rate across 50 diverse real-world motion trajectories, including jumps and complex loco-manipulation. This successful zero-shot sim-to-real transfer mirrors recent breakthroughs from Amazon's PHP framework, suggesting that the "data gap" in robotics is finally being bridged by massive-scale simulation and motion-capture datasets.
The code and model are currently available on GitHub via the GR00T-WholeBodyControl repository.