- Model capabilities: What can it do?
- Technical characteristics
- How does it perform?
- Developer support: How to access and use it?
Google DeepMind has launched Gemini Robotics On-Device, a new-generation robot AI model that can **run independently on the robot itself** without relying on cloud-based computing resources. The model integrates visual perception, language understanding and action decision-making to perform intelligent, varied practical tasks.
**The core issues it addresses:**
- **Reduced reliance on cloud computing:** lower latency and faster response.
- **Operation in unstable network environments:** improved reliability.
- **General-purpose manipulation and rapid adaptation to new tasks:** improved robot versatility.
**Gemini Robotics**, first launched in March 2025, is a vision-language-action (VLA) model built on Gemini that adds reasoning about the physical world to the model's vision, language and action capabilities.
Model capabilities: What can it do?
Technical characteristics
- Optimized for local deployment
  - The model has undergone **compute and resource compression optimization** so that it can run on robot hardware with limited capacity.
  - No GPU server is required; CPUs or small AI chips can support inference.
- Multimodal integration
  - Built on DeepMind's Gemini 2.0 architecture, combining vision, language and action control.
  - Integrates **image perception + instruction understanding + action execution**.
- Few-shot fine-tuning mechanism
  - Supports adaptation to new tasks from very few samples (only 50-100 demonstrations), which significantly lowers the barrier to development and deployment.
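The few-shot adaptation described above can be illustrated with a minimal behaviour-cloning sketch. It is not the Gemini Robotics fine-tuning procedure or SDK, which this article does not detail. It assumes a frozen encoder has already turned each observation into a small feature vector, and it fits only a linear action head to 50 synthetic demonstrations. All names (`EXPERT_WEIGHTS`, `make_demonstrations`, `fit_action_head`, `mean_squared_error`) and the data are hypothetical.

```python
import random

# Hypothetical "expert" mapping from features to a 1-D action (illustration only).
EXPERT_WEIGHTS = [0.5, -1.0, 2.0]


def make_demonstrations(n_demos, seed=0):
    """Synthetic stand-in for teleoperated demos: (features, action) pairs."""
    rng = random.Random(seed)
    demos = []
    for _ in range(n_demos):
        features = [rng.uniform(-1.0, 1.0) for _ in EXPERT_WEIGHTS]
        action = sum(w * x for w, x in zip(EXPERT_WEIGHTS, features))
        demos.append((features, action))
    return demos


def fit_action_head(demos, lr=0.5, epochs=200):
    """Behaviour cloning: gradient descent on the mean squared action error."""
    dim = len(demos[0][0])
    weights = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for features, action in demos:
            error = sum(w * x for w, x in zip(weights, features)) - action
            for i, x in enumerate(features):
                grad[i] += 2.0 * error * x / len(demos)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def mean_squared_error(weights, demos):
    """Average squared gap between predicted and demonstrated actions."""
    total = 0.0
    for features, action in demos:
        predicted = sum(w * x for w, x in zip(weights, features))
        total += (predicted - action) ** 2
    return total / len(demos)


train_demos = make_demonstrations(50)          # lower end of the 50-100 range
held_out = make_demonstrations(20, seed=1)     # unseen demonstrations
head = fit_action_head(train_demos)
print("held-out MSE:", mean_squared_error(head, held_out))
```

Because only the small head is trained while the (assumed) encoder stays fixed, a few dozen demonstrations are enough to pin down its parameters in this toy setting; a real robot policy is far larger and nonlinear, so this shows the idea rather than the scale.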
How does it perform?
In many tests, it outperformed existing models:
- **Higher task completion rate:** the model generalizes better, particularly on previously unseen tasks and in new environments.
- **Instruction following:** it outperforms other on-device alternatives on more challenging out-of-distribution tasks and complex multi-step instructions.
- **Faster response:** inference runs locally, so there is no need to wait for results from the cloud.
- **More stable execution:** it maintains a high level of consistency across different robot platforms.
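The latency point can be made concrete with simple arithmetic: if each control step waits for one inference, the step time is the inference time plus any network round trip. The sketch below computes the resulting upper bound on control rate. The function name and the millisecond figures are illustrative assumptions, not measurements of Gemini Robotics On-Device.

```python
def max_control_rate_hz(inference_ms, network_round_trip_ms=0.0):
    """Upper bound on closed-loop control rate when every step waits for one inference."""
    step_ms = inference_ms + network_round_trip_ms
    if step_ms <= 0:
        raise ValueError("step time must be positive")
    return 1000.0 / step_ms


# Hypothetical figures chosen only to show the effect of removing the network hop.
local_rate = max_control_rate_hz(inference_ms=40.0)
cloud_rate = max_control_rate_hz(inference_ms=40.0, network_round_trip_ms=160.0)
print(f"local: {local_rate:.1f} Hz, cloud: {cloud_rate:.1f} Hz")
```

The same function also shows why an unstable network hurts reliability: any spike in round-trip time directly lowers the achievable control rate, whereas the local path has no such term.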
**Example experimental tasks:**
Adaptability: the model not only runs locally but also transfers across platforms. After training on the ALOHA platform, it was adapted to:
- Franka FR3 bi-arm robot: completed industrial-grade assembly tasks.
- Apollo humanoid robot: followed natural-language instructions in household and service-type environments.
Notably, this cross-platform transfer **does not require retraining the model**; only minor adjustments are needed to reuse the same capabilities.
Developer support: How to access and use it?
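- MuJoCo: the physics simulator in which the model can be tested.
- Gemini Robotics SDK: for evaluating the model on your own tasks and environments and adapting it to new domains; access is through DeepMind's trusted tester program.
- Gemini Robotics tech report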
**Gemini Robotics On-Device marks robot AI's entry into a new phase: usable, deployable and scalable.**
Its significance includes:
- **Edge intelligence put into practice:** robots can reason and perform tasks on their own, without relying on external servers.
- **Lower deployment costs:** adaptability plus rapid fine-tuning lowers the barrier to industry adoption.
- **A unified model architecture across hardware:** in the future, a single model could be adapted to robots of many different forms.
Official presentation: https://deepmind.google/discover/blog/gemini-robotics-on-device-brings-ai-to-local-robotic-devices/