The Last Millimeter: Physical Intelligence Unveils RL Tokens for Hyper-Fast Precision


Picking up a tool is a solved problem for modern foundation models, but using it with the precision of a skilled technician remains the industry's "last millimeter" hurdle. On March 19, 2026, Physical Intelligence (Pi) announced a potential solution: RL Tokens (RLT), a reinforcement learning method that allows robots to master high-precision tasks in minutes rather than days.
The update comes just weeks after the company's release of its Multi-Scale Embodied Memory and follows the successful deployment of its foundation model in industrial and domestic settings.
Bridging the VLA Gap
While Vision-Language-Action (VLA) models are excellent at broad competence—like cooking a grilled cheese sandwich—they often struggle with the fine-grained adjustments required for contact-rich tasks. Aligning a screwdriver with a tiny M3 screw or threading a zip tie requires sub-millimeter accuracy that broad pre-training rarely provides.
Pi’s RLT method tackles this by adding a specialized "RL token" output to the model. This token acts as a compressed information bottleneck, summarizing the VLA’s vast internal world representation into a concise feature vector. That vector is then fed into a lightweight actor-critic network small enough to be trained on-device in real time.
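Pi has not published the exact architecture, so the following is a minimal sketch of the general idea, with every dimension and name (HIDDEN_DIM, TOKEN_DIM, rl_token, and so on) invented for illustration: a large, frozen VLA hidden state is projected down to a compact token, and tiny actor and critic heads read only that token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- not from Pi's paper.
HIDDEN_DIM = 4096   # VLA internal representation (frozen backbone)
TOKEN_DIM = 64      # compressed "RL token" bottleneck
ACTION_DIM = 7      # e.g., 6-DoF end-effector delta + gripper

# A learned projection compresses the VLA's hidden state into the token.
W_token = rng.normal(0.0, 0.02, (TOKEN_DIM, HIDDEN_DIM))

# Lightweight actor-critic heads operate only on the small token.
W_actor = rng.normal(0.0, 0.02, (ACTION_DIM, TOKEN_DIM))
W_critic = rng.normal(0.0, 0.02, (1, TOKEN_DIM))

def rl_token(hidden_state):
    """Compress the VLA hidden state into the bottleneck token."""
    return np.tanh(W_token @ hidden_state)

def actor(token):
    """Propose a small correction to the base action."""
    return W_actor @ token

def critic(token):
    """Scalar value estimate used to judge corrections."""
    return float((W_critic @ token)[0])

hidden = rng.normal(size=HIDDEN_DIM)   # stand-in for a real VLA forward pass
tok = rl_token(hidden)
```

The point of the bottleneck is that only the small heads need gradient updates during practice; the multi-billion-parameter backbone stays frozen.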
The results are significant:
- Rapid Adaptation: Robots can refine the most difficult stages of a task with as little as 15 minutes of real-world practice data.
- Speed Superiority: In Ethernet cable insertion tests, the RLT-enhanced policy not only ran faster than the base model but surpassed the median speed of human teleoperation.
- Efficiency: Because the RL tokens allow for a much smaller training architecture, the robot can perform hundreds of updates per second while practicing.
Moving Beyond Recap
This new method represents a surgical refinement of Pi’s previous work. While the company’s Recap algorithm focused on broad improvements and autonomous error recovery for long-horizon tasks, RLT is designed for "on-the-job" learning of specific, delicate skills.
Rather than retraining the entire multi-billion parameter model—a process that is computationally prohibitive—RLT allows the robot to "edit" its predicted actions. The system stays grounded in its prior VLA training but deviates when the real-time critic identifies a more efficient path to success.
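One simple way to implement this "editing" behavior is a bounded residual correction: keep the VLA's action unless the critic prefers the edited one, and clip any deviation so the policy stays near its prior. This is a hypothetical sketch under those assumptions; the function name, gating rule, and clip radius are invented, not Pi's published method.

```python
import numpy as np

def edit_action(base_action, correction, values, max_dev=0.02):
    """Apply a bounded learned correction to the VLA's predicted action.

    base_action : action proposed by the frozen VLA prior
    correction  : deviation proposed by the lightweight RLT actor
    values      : (v_base, v_edited) critic estimates for each choice
    max_dev     : clip radius keeping the policy grounded in its prior
    """
    v_base, v_edited = values
    if v_edited <= v_base:           # critic sees no better path: keep prior
        return base_action
    delta = np.clip(correction, -max_dev, max_dev)
    return base_action + delta

base = np.array([0.10, -0.05, 0.00])
fix = np.array([0.50, 0.01, -0.01])  # oversized first component gets clipped
out = edit_action(base, fix, values=(0.2, 0.7))
```

The clipping is what keeps "fail and adapt" learning safe-ish: the edited policy can only nudge the pre-trained behavior, never replace it wholesale.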
The "End Game" of Real-World RL
The announcement drew a quick response from industry peers. Bernt Børnich, CEO of 1X Technologies, characterized real-world reinforcement learning as the "end game" for the sector. However, Børnich noted that scaling such a method requires safe, compliant hardware to prevent the robot from destroying itself or its environment during the "fail and adapt" phase of learning.
Pi appears to be betting that its software-first approach—fueled by a recent $600 million funding round—can solve these precision issues across various third-party hardware platforms. By focusing on the "RL token" as a modular interface, Pi is positioning itself as the primary "intelligence layer" for any robot chassis that needs to move past simple pick-and-place maneuvers.
What’s Next for Pi?
The company’s research paper demonstrates the RLT method on four key tasks: screwdriving, zip-tying, Ethernet insertion, and power-cord plugging. While these are currently isolated skills, the roadmap involves integrating this fine-grained refinement into longer autonomous workflows, such as full electronic assembly or complex kitchen maintenance.
As robots move out of the lab and into "factory-ready" roles, the ability to learn directly from experience without human intervention will be the line between a novelty and a tool. With RLT, Pi is suggesting that the "dark matter" of robotic intuition—forces, friction, and fine-tuning—might finally be within reach of a software update.