Google DeepMind Unveils Gemini Robotics-ER 1.6: A Leap in Spatial Reasoning and Industrial Utility


MOUNTAIN VIEW, CA — On April 14, 2026, Google DeepMind announced the release of Gemini Robotics-ER 1.6, a significant upgrade to its specialized "Embodied Reasoning" framework. The new model, which follows the two-part brain architecture established in late 2025, introduces enhanced spatial reasoning, improved multi-view success detection, and a new "agentic vision" capability designed for high-precision industrial tasks.
Starting today, the model is available to developers via the Gemini API and Google AI Studio, signaling a rapid iteration cycle from Gemini Robotics-ER 1.5, released just months prior.
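For developers, access follows the standard Gemini API workflow. Below is a minimal sketch using the google-genai Python SDK; the model identifier `gemini-robotics-er-1.6` is an assumption, since the announcement does not specify the exact string:

```python
# pip install google-genai
from google import genai
from google.genai import types

client = genai.Client()  # reads the GEMINI_API_KEY environment variable

with open("workbench.jpg", "rb") as f:
    frame = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed ID; check AI Studio for the released name
    contents=[
        types.Part.from_bytes(data=frame, mime_type="image/jpeg"),
        "Describe, step by step, how to clear this workbench.",
    ],
)
print(response.text)
```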
The Evolution of the "Strategic Planner"
In DeepMind’s robotics stack, the "ER" (Embodied Reasoning) model acts as the high-level strategic planner. While the Vision-Language-Action (VLA) model serves as the motor cortex, the ER model interprets complex, open-ended goals.
According to DeepMind researchers Laura Graesser and Peng Xu, Gemini Robotics-ER 1.6 shows substantial improvements over its predecessor, ER 1.5, and the base Gemini 3.0 Flash model. The model specializes in "bridging the gap between digital intelligence and physical action" by natively calling tools like Google Search or specialized VLAs to solve tasks.
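That tool-calling behavior maps naturally onto Gemini's existing function-calling API. The sketch below registers a hypothetical `execute_skill` function standing in for a downstream VLA; the function name, schema, and model ID are illustrative, not part of the announcement:

```python
from google import genai
from google.genai import types

# Hypothetical VLA skill exposed to the planner as a callable tool.
execute_skill = types.FunctionDeclaration(
    name="execute_skill",
    description="Run a low-level VLA motor skill such as 'pick' or 'place'.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={
            "skill": types.Schema(type=types.Type.STRING),
            "target_object": types.Schema(type=types.Type.STRING),
        },
        required=["skill", "target_object"],
    ),
)

client = genai.Client()
response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed ID
    contents="Put the red mug in the dishwasher.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=[execute_skill])],
    ),
)

# The planner emits function calls; the robot runtime dispatches them to its VLA.
for call in response.function_calls or []:
    print(call.name, call.args)
```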

Precision Pointing and Spatial Logic
A core focus of the 1.6 update is "pointing"—a fundamental skill for spatial reasoning. The model uses points to identify objects, map trajectories, and define "from-to" relationships. DeepMind claims the new model can now handle more complex constraints, such as identifying every object in a scene small enough to fit inside a specific container, with much higher accuracy than earlier versions.
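Gemini Robotics-ER 1.5 returned points as JSON with coordinates normalized to a 0-1000 grid; assuming 1.6 keeps that convention, a constraint-qualified pointing query might look like this sketch (file name and prompt are illustrative):

```python
import json

from google import genai
from google.genai import types

client = genai.Client()

with open("bin_scene.jpg", "rb") as f:
    frame = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed ID
    contents=[
        types.Part.from_bytes(data=frame, mime_type="image/jpeg"),
        # Constraint-qualified pointing query of the kind described above.
        "Point to every object small enough to fit inside the blue container. "
        'Answer as a JSON list: [{"point": [y, x], "label": <name>}], '
        "with coordinates normalized to 0-1000.",
    ],
    config=types.GenerateContentConfig(response_mime_type="application/json"),
)

for item in json.loads(response.text):
    print(item["label"], item["point"])
```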
Solving the "Last Millimeter" with Agentic Vision
Perhaps the most notable addition is the ability to perform instrument reading. Developed in close collaboration with Boston Dynamics, this feature allows robots like the all-electric Atlas or the quadruped Spot to interpret analog pressure gauges, thermometers, and digital readouts during facility inspections.
This is achieved through Agentic Vision, a process where the model:
- Zooms into high-resolution details of a gauge.
- Estimates proportions and intervals using code execution.
- Interprets context using world knowledge to determine if a reading indicates a safety hazard (a code sketch of this loop follows below).
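DeepMind frames Agentic Vision as model-driven, but the loop can be approximated from the client side. In this hedged sketch, we crop a high-resolution patch around the gauge (step 1), then let the model's code-execution tool handle measurement and interpretation (steps 2 and 3); file name and crop coordinates are illustrative:

```python
from io import BytesIO

from PIL import Image
from google import genai
from google.genai import types

client = genai.Client()

# Step 1: zoom. Crop a high-resolution patch around the gauge; the region
# could itself come from a pointing query like the one shown earlier.
full = Image.open("inspection_frame.jpg")
patch = full.crop((1200, 800, 1800, 1400))  # (left, top, right, bottom)
buf = BytesIO()
patch.save(buf, format="JPEG")

# Steps 2-3: measure the needle via code execution, then judge the reading.
response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed ID
    contents=[
        types.Part.from_bytes(data=buf.getvalue(), mime_type="image/jpeg"),
        "Read this pressure gauge. Estimate the needle position numerically, "
        "then state whether the reading indicates a safety hazard.",
    ],
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())],
    ),
)
print(response.text)
```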
"Capabilities like instrument reading... will enable Spot to see, understand, and react to real-world challenges completely autonomously," said Marco da Silva, VP and GM of Spot at Boston Dynamics.

Bridging the Success Detection Gap
One of the persistent hurdles in Physical AI is "success detection"—a robot's ability to know when a task is actually finished.
Gemini Robotics-ER 1.6 introduces advanced multi-view reasoning, allowing it to synthesize data from multiple camera streams, such as an overhead view and a wrist-mounted feed. This enables the model to confirm task completion even in occluded or poorly lit environments, a critical requirement for moving beyond scripted lab demos and into the "messy" reality of factories.
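In API terms, multi-view reasoning amounts to passing several synchronized frames in one request and asking for a verdict. A sketch, with hypothetical camera file names and an assumed model ID:

```python
from google import genai
from google.genai import types

client = genai.Client()

def frame(path: str) -> types.Part:
    """Load one camera frame as an image part."""
    with open(path, "rb") as f:
        return types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed ID
    contents=[
        frame("overhead_cam.jpg"),  # wide view, possibly occluded
        frame("wrist_cam.jpg"),     # close view from the gripper
        "Task: place the connector into the fixture. Using both views, "
        "answer 'success' or 'failure', cite the visual evidence, and "
        "note any occlusion in either view.",
    ],
)
print(response.text)
```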
Safety and Physical Constraints
DeepMind characterizes 1.6 as its "safest robotics model yet." It reportedly shows a significantly improved capacity to adhere to physical safety constraints, such as refusing to pick up objects that exceed a gripper's weight limit or declining to handle hazardous materials such as liquids.
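One practical way to exercise this behavior is to declare the robot's physical limits up front and request an action that violates them. In the sketch below, the 2 kg payload limit, the prompt, and the model ID are all illustrative assumptions:

```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-robotics-er-1.6",  # assumed ID
    contents="Pick up the cast-iron vise and carry it to shelf B.",
    config=types.GenerateContentConfig(
        # Hypothetical embodiment limits; values are illustrative.
        system_instruction=(
            "You plan actions for a single-arm robot whose gripper has a "
            "2 kg payload limit. Refuse any step that exceeds that limit or "
            "that contacts liquids or other hazardous materials, and say why."
        ),
    ),
)
print(response.text)  # expected: a refusal citing the payload limit
```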
In tests based on real-life injury reports, the Robotics-ER models improved by 10% in identifying safety hazards in video scenarios compared to the standard Gemini 3.0 Flash. This focus on "alignment for embodied intelligence" mirrors recent efforts by competitors like Generalist AI to ensure that autonomous improvisations remain safe on the factory floor.
The "Android of Robotics" Strategy Continues
The launch of Gemini Robotics-ER 1.6 reinforces DeepMind's ambition to build a universal operating system for robots. By making the reasoning model available via API, DeepMind is positioning its software to be the "brain" for a diverse ecosystem of hardware, including the Agile ONE humanoid and the Apptronik Apollo.
As 2026 progresses, the industry’s focus is clearly shifting from simple motor skills to the high-level reasoning required for robots to navigate the "long tail" of real-world industrial problems.