
DeepMind unveils two "Robotics" models that boost robot intelligence

What if robots could finally think like us? Google DeepMind has just unveiled two artificial intelligence models, dubbed RT-X and AutoRT, capable of giving robots a much more nuanced understanding of their environment.
These systems, developed through research in multimodal learning, mark a major step forward toward cognitive robotics, where machines no longer simply execute commands, but analyze, learn, and explain their actions.

DeepMind isn't talking here about simple mechanical control models, but about true general-purpose intelligence architectures applied to robotics.

| Feature | RT-X | AutoRT |
|---|---|---|
| Model type | Unified model of robotic reasoning and action | Multi-robot orchestration and autonomy system |
| Main objective | Understand and follow instructions in natural language | Plan, coordinate, and optimize the operations of multiple robots simultaneously |
| Training | Based on data from more than 30 laboratories; 17 billion parameters | Continuous self-learning with autonomous feedback |
| Inputs | Vision, text, verbal interaction | Data from multiple sensors, vision, and performance feedback |
| Key skills | Contextual understanding, task transfer, explainable reasoning | Coordination, self-correction, robot fleet management |
| Adaptation speed | Up to 60% faster than DeepMind's previous systems | Real-time optimization powered by an automated scheduling engine |
| Application areas | Domestic, industrial, and experimental robotics | Multi-agent environments: warehouses, laboratories, hospitals |
| Autonomy | Language-based reasoning | Autonomous strategic control under human supervision |

A first demonstration video illustrates RT-X's ability to interpret complex instructions such as "Sort the objects on the table by color" and to adjust its movements autonomously.

In this second video, DeepMind demonstrates how RT-X and AutoRT work together to manipulate objects, avoid obstacles, or coordinate multiple robots within the same workspace.

DeepMind's models rely on a combination of computer vision, spatial reasoning, and natural language processing. Whereas older systems required specific training for each task, RT-X learns to generalize.

By combining images, descriptions, and verbal instructions, it becomes capable of developing a comprehensive action plan and justifying its choices. A robot can thus explain why it chooses a particular route or decides to move one object rather than another.
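To make this concrete, here is a deliberately toy sketch of a language-conditioned planner that produces both an action sequence and a justification, in the spirit of the "Sort the objects by color" example. The function and its logic are invented for illustration; they are not DeepMind's actual RT-X interface.

```python
# Hypothetical sketch: a planner that turns the instruction
# "sort the objects on the table by color" into ordered steps
# plus a human-readable rationale. Purely illustrative.

def plan_sort_by_color(objects):
    """Group (name, color) objects into color bins and return
    the pick-and-place steps along with a justification."""
    bins = {}
    for name, color in objects:
        bins.setdefault(color, []).append(name)
    steps = []
    for color in sorted(bins):          # deterministic bin order
        for name in bins[color]:
            steps.append(f"move {name} to the {color} bin")
    rationale = (
        f"Grouped {len(objects)} objects into {len(bins)} color bins "
        "so each bin holds objects of a single color."
    )
    return steps, rationale

steps, why = plan_sort_by_color(
    [("cube", "red"), ("ball", "blue"), ("block", "red")]
)
print(steps[0])  # → move ball to the blue bin
print(why)
```

The point of the sketch is the pairing: every plan comes with a rationale the robot can surface, which is the explainability property the article attributes to RT-X.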

According to DeepMind, RT-X is based on a multimodal architecture with 17 billion parameters, capable of integrating visual and textual cues to understand the context of an action.

What sets these models apart is their ability to learn from their mistakes.
AutoRT incorporates a self-evaluation mechanism that allows it to correct its actions and improve its performance without constant human supervision.
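A self-evaluation loop of this kind can be sketched as measure, score, correct, repeat. The scoring and correction rules below are invented for the example; AutoRT's actual mechanism is not public.

```python
# Illustrative self-correction loop: a gripper width converges on a
# target by evaluating its own error each attempt, with no human input.
# Numbers and the halving rule are assumptions for the demo.

def self_correcting_grasp(target, initial, tolerance=0.5, max_tries=10):
    """Iteratively adjust toward target, logging each self-evaluation."""
    width = initial
    log = []
    for attempt in range(1, max_tries + 1):
        error = target - width
        log.append((attempt, round(width, 2), round(error, 2)))
        if abs(error) <= tolerance:
            return width, log        # self-evaluated as good enough
        width += error / 2           # correct half the error and retry
    return width, log

width, log = self_correcting_grasp(target=4.0, initial=8.0)
print(width)   # → 4.5 (within tolerance after 4 attempts)
```

Each log entry is an attempt the system judged and corrected on its own, which is the trial-and-error behavior the researchers compare to developmental learning.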

DeepMind researchers compare this behavior to a form of developmental learning, similar to that of a child discovering the world through trial and error.

This approach leads to robotics that is more autonomous, more adaptive, and capable of operating in unpredictable situations.

The impact of these models goes beyond mere technical performance.
DeepMind designed RT-X and AutoRT as open and collaborative systems: more than 30 international laboratories are participating in their development as part of the Open X-Embodiment project.

This initiative aims to create a shared knowledge base among robots, where every learning experience can be shared.
A robot trained in a Tokyo laboratory could thus instantly benefit from the experience of another robot based in Zurich.
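The Tokyo-to-Zurich idea can be sketched as a shared experience pool: any lab contributes episodes, and every robot can query them. The class below is a hypothetical toy, not the Open X-Embodiment data format.

```python
# Toy sketch of a cross-lab shared experience pool. The class and its
# methods are invented to illustrate the idea of pooled robot data.
from collections import defaultdict

class SharedExperiencePool:
    """Episodes contributed by any lab become visible to every robot."""
    def __init__(self):
        self.episodes = defaultdict(list)  # task -> [(lab, outcome)]

    def contribute(self, lab, task, outcome):
        self.episodes[task].append((lab, outcome))

    def lookup(self, task):
        # A robot anywhere retrieves all recorded experience for a task.
        return self.episodes[task]

pool = SharedExperiencePool()
pool.contribute("Tokyo", "open_drawer", "success")
pool.contribute("Zurich", "open_drawer", "failure: handle slipped")
# A robot in Zurich now sees the Tokyo episode immediately.
print(pool.lookup("open_drawer"))
```

The design choice worth noting is that experience is keyed by task rather than by robot, so a success in one lab is retrievable by any embodiment attempting the same task.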

According to DeepMind’s estimates, the combined use of AutoRT and RT-X could increase the speed at which robots adapt to complex environments such as warehouses, hospitals, or homes by 60% [1].

These advances mark a profound shift: the robots of the future will no longer be mere executors, but true reflective agents, capable of reasoning, planning, and explaining their decisions.

An RT-X robot might say, “I moved this object to avoid a collision,” or “This surface seems unstable; I’m choosing a different foothold.”
This transparency, rare in robotics, marks a turning point toward explainable and responsible AI that is better integrated into human environments.

The increasing autonomy of robots raises questions of safety, accountability, and oversight.

DeepMind emphasizes the need for constant human oversight and the development of a robust safety framework. The RT-X and AutoRT models can only operate within defined and validated contexts.
The company explicitly rules out any military or surveillance use.

However, several researchers are calling for the establishment of international regulations for cognitive robotics, in order to provide a framework for these emerging technologies and ensure their ethical use [2].

With RT-X and AutoRT, DeepMind is bringing artificial intelligence a step closer to human cognition. These models pave the way for truly adaptive robots capable of understanding language, interacting naturally, and learning from their environment.

This convergence of perception, language, and action could transform robotics over the next decade: from industry to healthcare, from logistics to space exploration, robots are becoming thinking partners.

You can also read the article Gemini Gives Astronomers a New Eye: AI Detects the Mysteries of the Night Sky, which explores another application of general-purpose artificial intelligence in the scientific field.

1. DeepMind. (2025). Introducing RT-X and AutoRT: Toward General-Purpose Robots.
https://deepmind.google

2. European Robotics Forum. (2024). Ethics and Regulation of Cognitive Robotics.
https://roboticsforum.eu
