Google DeepMind’s optimized AI model runs directly on robots

One of Apptronik’s robots running the on-device model puts a Rubik’s Cube in a bag.

Google DeepMind is rolling out an on-device version of its Gemini Robotics AI model that can operate without an internet connection. The vision-language-action (VLA) model offers dexterity similar to that of the version released in March, but Google says “it’s small and efficient enough to run directly on a robot.”

The flagship Gemini Robotics model is designed to help robots complete a wide range of physical tasks, even ones the model hasn’t been specifically trained on. It allows robots to generalize to new situations, understand and respond to commands, and perform tasks that require fine motor skills.

Carolina Parada, head of robotics at Google DeepMind, tells The Verge that the original Gemini Robotics model uses a hybrid approach, allowing it to operate both on-device and in the cloud. But with this device-only model, users can access offline features that are almost as good as those of the flagship.

The on-device model can perform several different tasks out of the box, and it can adapt to new situations “with as few as 50 to 100 demonstrations,” according to Parada. Google trained the model only on its ALOHA robot, but the company was able to adapt it to other robot types, such as the humanoid Apollo robot from Apptronik and the bi-arm Franka FR3 robot.

“The Gemini Robotics hybrid model is still more powerful, but we’re actually quite surprised at how strong this on-device model is,” Parada says. “I would think about it as a starter model or as a model for applications that just have poor connectivity.” It could also be useful for companies with strict security requirements.

Alongside this launch, Google is releasing a software development kit (SDK) for the on-device model that developers can use to evaluate and fine-tune it, a first for one of Google DeepMind’s VLAs.

The on-device Gemini Robotics model and its SDK will be available to a group of trusted testers while Google continues to work toward minimizing safety risks.
