In a deep dive on the Robo Papers podcast, 1X Director of Evaluations Daniel Ho explains how "imagination" via video generation lets humanoids perform tasks zero-shot with minimal robot-specific training data.
NVIDIA GEAR Lab has unveiled DreamZero, a 14-billion-parameter World Action Model (WAM) that uses video diffusion to give robots physical "imagination," enabling zero-shot task completion and rapid adaptation across different robotic embodiments.
Google DeepMind has launched Project Genie, an experimental tool powered by the Genie 3 world model that creates interactive 3D environments from text and images. Beyond gaming, the technology serves as a cornerstone of DeepMind’s strategy to solve the robotics data bottleneck and achieve physical AGI.
Generalist AI's chief scientist argues that the next leap in robotics won't come from internet text, but from scaling the 'reflexive' intelligence of physical interaction.
Meta’s former AI chief, Yann LeCun, is taking his 'contrarian' bet to Paris, founding AMI Labs to build non-generative world models. Amid a rumored $3.5 billion valuation and high-profile spats with Elon Musk, LeCun is betting that the path to true robotic intelligence doesn't start with language.