Posted by Caren Chang – Developer Relations Engineer, Joanna (Qiong) Huang – Software Engineer, and Chengji Yan – Software Engineer
The latest version of Gemini Nano, our most powerful multimodal on-device model, just launched on the Pixel 10 device series and is now accessible through the ML Kit GenAI APIs. Integrate capabilities such as summarization, proofreading, rewriting, and image description directly into your apps.
With GenAI APIs, we’re focused on giving you access to the latest version of Gemini Nano while providing consistent quality across devices and model upgrades. Here’s a sneak peek behind the scenes at some of the things we’ve done to achieve this.
Adapting GenAI APIs for the latest Gemini Nano
We want to make it as easy as possible for you to build AI-powered features using the most powerful models. To ensure GenAI APIs provide consistent quality across different model versions, we make many behind-the-scenes improvements, including rigorous evals and adapter training.
- Evaluation pipeline: For each supported language, we prepare an evaluation dataset. We then benchmark the evals through a combination of LLM-based raters, statistical metrics, and human raters.
- Adapter training: With results from the evaluation pipeline, we then determine whether we need to train feature-specific LoRA adapters to be deployed on top of the Gemini Nano base model. By shipping GenAI APIs with LoRA adapters, we ensure each API meets our quality bar regardless of the version of Gemini Nano running on a device.
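To make the two steps above concrete, here is a minimal sketch of how per-language evaluation scores might feed an adapter-training decision. Everything in it is an assumption for illustration: the `EvalResult` type, the 0-to-1 score scale, the unweighted average, and the `quality_bar` value are hypothetical and do not describe the actual pipeline.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalResult:
    """Scores for one language on one feature (hypothetical 0..1 scale)."""
    language: str
    llm_rater: float      # score from LLM-based raters
    statistical: float    # score from statistical metrics
    human_rater: float    # score from human raters


def combined_score(result: EvalResult) -> float:
    # Assumption: an unweighted average of the three signals.
    return mean([result.llm_rater, result.statistical, result.human_rater])


def languages_needing_adapter(results, quality_bar=0.8):
    """Return the languages whose combined score falls below the bar."""
    return [r.language for r in results if combined_score(r) < quality_bar]


# Made-up scores for a single feature, e.g. summarization.
results = [
    EvalResult("en", llm_rater=0.90, statistical=0.85, human_rater=0.95),
    EvalResult("ja", llm_rater=0.70, statistical=0.60, human_rater=0.80),
]

for r in results:
    print(f"{r.language}: combined score {combined_score(r):.2f}")
print("Train a LoRA adapter for:", languages_needing_adapter(results))
```

In this sketch, a language that falls below the bar would be a candidate for a feature-specific LoRA adapter, while languages that already meet it could run on the base model unchanged.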
The latest Gemini Nano performance
One area we’re excited about is how this updated version of Gemini Nano pushes performance even higher, especially the prefix speed, which is how fast the model processes input.
For example, here are results when running text-to-text and image-to-text benchmarks on a Pixel 10 Pro.
| | Prefix Speed – Gemini nano-v2 on Pixel 9 Pro | Prefix Speed – Gemini nano-v2* on Pixel 10 Pro | Prefix Speed – Gemini nano-v3 on Pixel 10 Pro |
| Text-to-text | 510 tokens/second | 610 tokens/second | 940 tokens/second |
| Image-to-text | 510 tokens/second + 0.8 seconds for image encoding | 610 tokens/second + 0.7 seconds for image encoding | 940 tokens/second + 0.6 seconds for image encoding |
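To get a feel for what these numbers mean in practice, the sketch below estimates how long the model would take to process a prompt under each configuration. It assumes input processing time is simply the token count divided by the prefix speed, plus the fixed image encoding time for image-to-text. The 1,000-token prompt and the `prefill_seconds` helper are hypothetical, and real latency also depends on factors not covered here.

```python
# (tokens/second, image encoding seconds) taken from the table above.
BENCHMARKS = {
    "nano-v2 on Pixel 9 Pro": (510, 0.8),
    "nano-v2* on Pixel 10 Pro": (610, 0.7),
    "nano-v3 on Pixel 10 Pro": (940, 0.6),
}


def prefill_seconds(prompt_tokens, tokens_per_second, image_encoding_seconds=0.0):
    """Estimated time to process the input (hypothetical helper).

    Assumption: token processing time plus a fixed image encoding cost.
    """
    return prompt_tokens / tokens_per_second + image_encoding_seconds


PROMPT_TOKENS = 1000  # hypothetical prompt size

for name, (speed, encode_s) in BENCHMARKS.items():
    text_s = prefill_seconds(PROMPT_TOKENS, speed)
    image_s = prefill_seconds(PROMPT_TOKENS, speed, encode_s)
    print(f"{name}: text-to-text {text_s:.2f}s, image-to-text {image_s:.2f}s")
```

Under these assumptions, a 1,000-token text prompt drops from roughly 1.96 seconds on nano-v2 (Pixel 9 Pro) to roughly 1.06 seconds on nano-v3 (Pixel 10 Pro).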
The future of Gemini Nano with GenAI APIs
As we continue to improve the Gemini Nano model, the team is committed to using the same process to ensure consistent, high-quality results from GenAI APIs.
We hope this will significantly reduce the effort to integrate Gemini Nano into your Android apps while still allowing you to take full advantage of new versions and their improved capabilities.
Learn more about GenAI APIs
Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples: GenAI API Catalog and ML Kit GenAI APIs quickstart samples.