Abhishek Dave for SSOJet

Posted on • Originally published at blog.ssojet.com

Google DeepMind Launches Gemini Robotics: Merging AI with the Physical World

Originally published at https://blog.ssojet.com/news-2025-03-google-gemini-robotics

Google DeepMind has unveiled Gemini Robotics, an advanced AI model built on Gemini 2.0 that combines vision, language, and action to control robots. The goal is robots that adapt and respond to their environments more effectively.

Hands from the Robot’s POV. A pair of robotic hands move tiles into the word ‘world’ under the text ‘Gemini for the Physical’.

Image courtesy of Google DeepMind

Key Features of Gemini Robotics

Embodied Reasoning

A notable aspect of Gemini Robotics is its embodied reasoning capability, which lets robots comprehend and react to their surroundings in a way similar to how humans do. This matters most for tasks that require quick adaptation in dynamic environments.

Humanoid Robotics Initiative

Google DeepMind is collaborating with Apptronik to develop the next generation of humanoid robots. The partnership aims to create robots that can operate alongside humans in various settings, including homes and workplaces.

Safety and Ethical Considerations

With safety as a priority, Gemini Robotics incorporates collision avoidance and force limitation mechanisms. The ASIMOV dataset, which takes its inspiration from Isaac Asimov’s Three Laws of Robotics, is used to evaluate and improve safety behavior, with the aim of having robots act ethically and safely around humans.
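
To make "force limitation" concrete, here is a minimal sketch of one common approach: capping the magnitude of a commanded force while preserving its direction. This is an illustration only and not Gemini Robotics code; the function name `limit_force`, the 3D tuple representation, and the 20 N limit are all assumptions made for the example.

```python
import math


def limit_force(force, max_newtons):
    """Scale a 3D force command so its magnitude never exceeds max_newtons.

    Hypothetical helper for illustration; not part of any Gemini Robotics API.
    """
    if max_newtons <= 0:
        raise ValueError("max_newtons must be positive")
    magnitude = math.sqrt(sum(c * c for c in force))
    if magnitude <= max_newtons:
        # Already within the limit: pass the command through unchanged.
        return tuple(force)
    # Shrink every component by the same factor so the direction is preserved.
    scale = max_newtons / magnitude
    return tuple(c * scale for c in force)


# Assumed example values: a 50 N command capped at 20 N.
commanded = (30.0, 40.0, 0.0)
print(limit_force(commanded, max_newtons=20.0))
```

Scaling the whole vector, rather than clamping each axis separately, keeps the robot pushing in the intended direction, just more gently.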

Gemini Robotics’ Capabilities

Generality

Gemini Robotics is designed to generalize across a wide range of tasks, including ones it has not encountered in training. According to Google DeepMind, it more than doubles, on average, the performance of other state-of-the-art vision-language-action models on a comprehensive generalization benchmark.

Interactivity

The model accepts commands in natural language and adapts its behavior in real time to new instructions or changes in its environment. This “steerability” makes it easier for humans and robots to collaborate in diverse settings.

If an object slips from its grasp, or someone moves an item around, Gemini Robotics quickly replans and carries on — a crucial ability for robots in the real world.

Image courtesy of Google DeepMind
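
The replanning behavior described above can be pictured as a loop that re-observes the scene before every action. The following toy sketch shows that pattern under heavy simplification: the `plan` and `run_task` functions, the location names, and the `disturbances` schedule are all hypothetical, and nothing here reflects how Gemini Robotics is actually implemented.

```python
def plan(locations, obj, target):
    """Build a fresh step list from where the object is right now."""
    return [("move_to", locations[obj]), ("grasp", obj),
            ("move_to", target), ("release", obj)]


def run_task(locations, obj, target, disturbances=None, max_replans=3):
    """Execute one step per tick, replanning when the object has been moved.

    `disturbances` maps a tick number to a new location, modeling someone
    moving the object while the robot is not holding it.
    """
    disturbances = dict(disturbances or {})
    log, replans, tick, holding = [], 0, 0, False
    assumed = locations[obj]          # location the current plan was built for
    steps = plan(locations, obj, target)
    while steps:
        if tick in disturbances and not holding:
            locations[obj] = disturbances[tick]
        tick += 1
        # Re-observe before acting; a stale assumption triggers a replan.
        if not holding and locations[obj] != assumed:
            if replans == max_replans:
                raise RuntimeError("too many replans")
            replans += 1
            assumed = locations[obj]
            steps = plan(locations, obj, target)
            log.append(("replan", assumed))
            continue
        action, arg = steps.pop(0)
        if action == "grasp":
            holding = True
        elif action == "release":
            holding = False
            locations[obj] = target
        log.append((action, arg))
    return log


# Assumed scenario: the cup is moved to the counter just before the grasp.
for entry in run_task({"cup": "table"}, "cup", "shelf", disturbances={1: "counter"}):
    print(entry)
```

The key design choice is checking the world state before each step instead of only once at the start, which is what lets the task continue after an interruption.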

Dexterity

Gemini Robotics excels at complex, multi-step tasks that require precise manipulation, such as folding origami or packing items. This level of dexterity matters because many tasks that humans handle effortlessly are hard for robots.

Gemini Robotics-ER Model

Alongside Gemini Robotics, Google DeepMind introduced Gemini Robotics-ER (Embodied Reasoning), a model focused on spatial reasoning that roboticists can connect to their own programs. According to Google DeepMind, it significantly improves performance on tasks that require spatial understanding.

Gemini Robotics-ER excels at embodied reasoning capabilities including detecting objects and pointing at object parts.

Image courtesy of Google DeepMind
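
A program that consumes pointing results usually needs them in pixel coordinates. The sketch below assumes, and the article does not confirm, that the model replies with JSON entries holding a `label` and a `[y, x]` point normalized to a 0 to 1000 range. The field names, the range, the sample reply, and the helper `points_to_pixels` are illustrative assumptions, so check the official documentation for the actual output format.

```python
import json


def points_to_pixels(response_text, image_width, image_height, scale=1000):
    """Convert normalized [y, x] points in a JSON reply to (label, x, y) pixels.

    Assumes an illustrative reply format; not a confirmed Gemini Robotics-ER schema.
    """
    results = []
    for item in json.loads(response_text):
        y_norm, x_norm = item["point"]
        # Map the normalized range onto the image and clamp to valid indices.
        x_px = min(int(x_norm / scale * image_width), image_width - 1)
        y_px = min(int(y_norm / scale * image_height), image_height - 1)
        results.append((item["label"], x_px, y_px))
    return results


# Hypothetical reply for a 640x480 camera frame.
sample_reply = '[{"point": [250, 500], "label": "mug handle"}]'
print(points_to_pixels(sample_reply, image_width=640, image_height=480))
```

Keeping the conversion in one small function makes it easy to adjust if the real coordinate order or range differs from what is assumed here.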

Safety and Ethical Frameworks

Google DeepMind emphasizes a holistic approach to safety, integrating low-level motor control with high-level semantic understanding. The development of the Robot Constitution framework aims to ensure that robots operate within defined ethical boundaries, promoting human safety.

Implications for Robotics Industry

The introduction of Gemini Robotics signifies a pivotal shift in robotics, particularly in how AI can enhance physical interactions. As highlighted by Kanishka Rao, director of robotics at DeepMind, the model addresses a core challenge in robotics: the failure to generalize in unfamiliar scenarios.

This leap forward, combining AI and robotics, aligns with the trend of integrating large language models into robotic systems, making them more adaptable and responsive to human commands. The Gemini framework is expected to enable a new generation of robots capable of performing tasks with minimal programming.

Gemini Robotics displays advanced levels of dexterity.

Image courtesy of Google DeepMind

Collaboration with Other Robotics Companies

Google DeepMind is collaborating with trusted testers, including Agile Robots and Boston Dynamics, to refine the capabilities of the Gemini Robotics-ER model. These partnerships are aimed at exploring applications of the technology in real-world scenarios.

Conclusion

To explore how advanced authentication can strengthen security and user management in enterprise settings, consider SSOJet’s API-first platform. It supports secure SSO and user management with features such as directory sync, SAML, OIDC, and magic link authentication. Discover more at SSOJet.
