The Recorder app on Pixel sees a 24% increase in engagement with Gemini Nano-powered feature


Posted by Terence Zhang – Developer Relations Engineer and Kristi Bradford – Product Manager

Google Pixel’s Recorder app allows people to record, transcribe, save, and share audio. To make it easier for users to manage and revisit their recordings, Recorder’s developers turned to Gemini Nano, a powerful on-device large language model (LLM). This integration introduces an AI-powered audio summarization feature to help users more easily find the right recordings and quickly grasp key points.

Earlier this month, Gemini Nano got a power boost with the introduction of the new Gemini Nano with Multimodality model. The Recorder app is already leveraging this update to summarize longer voice recordings, with improved processing for grammar and nuance.

Meeting user needs with on-device AI

Recorder developers initially experimented with a cloud-based solution, achieving impressive levels of performance and quality. However, to prioritize accessibility and privacy for their users, they sought an on-device solution. The development of Gemini Nano presented a perfect opportunity to build the concise audio summaries users were looking for, all while keeping data processing on the device.

Gemini Nano is Google’s most efficient model for on-device tasks. “Having the LLM on-device is beneficial to users because it provides them with more privacy, less latency, and it works wherever they need since there’s no internet required,” said Kristi Bradford, the product manager for Pixel’s essential apps.

To achieve better results, Recorder also fine-tuned the model using data that matches its use case. This is done using low-rank adaptation (LoRA), which enables Gemini Nano to consistently output three-bullet-point descriptions of the transcript that include any speaker names, key takeaways, and themes.
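For readers unfamiliar with LoRA, the sketch below illustrates the general idea: the base model's weights stay frozen, and only a small pair of low-rank matrices is trained and added on top. This is a generic, minimal illustration in plain Python, not Recorder's or Gemini Nano's actual code; the function names, matrix sizes, and the `alpha` scaling value are all assumptions chosen for the example.

```python
# Illustrative sketch of a LoRA-style low-rank update.
# All names and values here are hypothetical, not from the Recorder app.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying the frozen W."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, rank):
    """Count the adapter parameters: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

# Hypothetical 4x4 frozen weight (identity matrix) with a rank-1 adapter.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
B = [[1.0], [0.0], [0.0], [0.0]]   # 4 x 1
A = [[0.0, 2.0, 0.0, 0.0]]         # 1 x 4
W_adapted = lora_effective_weight(W, B, A, alpha=1.0, rank=1)
```

In this toy case the rank-1 adapter trains 8 values instead of the 16 in the full matrix. The saving grows quickly with layer size, which is the usual reason LoRA is attractive for adapting a model to a narrow task such as transcript summarization.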

AICore, an Android system service that centralizes runtime, delivery, and critical safety components for LLMs, significantly streamlined Recorder’s adoption of Gemini Nano. The availability of a developer SDK for running GenAI workloads allowed the team to build the transcription summary feature in just four months, with only four developers. This efficiency was achieved by eliminating the need for maintaining in-house models.

Since its launch, Recorder users have been using the new AI-powered summarization feature an average of 2 to 5 times daily, and the number of overall saved recordings increased by 24%. This feature has contributed to a significant increase in app engagement and user retention overall. The Recorder team also noted that feedback about the new feature has been positive, with many users citing the time the new AI-powered summarization feature saves them.

“We were surprised by how truly capable the model was… before and after LoRA tuning.” — Kristi Bradford, product manager for Pixel’s essential apps

The next big evolution: Gemini Nano with multimodality

Recorder developers also implemented the latest Gemini Nano model, known as Gemini Nano with multimodality, to further improve its summarization feature on Pixel 9 devices. The new model is significantly larger than the previous one on Pixel 8 devices, and it’s more capable, accurate, and scalable. The new model also has expanded token support that lets Recorder summarize much longer transcripts than before. Gemini Nano with multimodality is currently only available on Pixel 9 devices.

Integrating Gemini Nano with multimodality required another round of fine-tuning. However, Recorder developers were able to use the original Gemini Nano model’s fine-tuning dataset as a foundation, streamlining the development process.

To fully leverage the new model’s capabilities, Recorder developers expanded their dataset with support for longer voice recordings, implemented refined evaluation methods, and established launch criteria metrics focused on grammar and nuance. The inclusion of grammar as a new metric for assessing inference quality was made possible solely by the enhanced capabilities of Gemini Nano with Multimodality.

UI example

Doing more with on-device AI

“Given the novelty of GenAI, the whole team had fun learning how to use it,” said Kristi. “Now, we’re empowered to push the boundaries of what we can accomplish while meeting emerging user needs and opportunities. It’s truly brought a new level of creativity to problem-solving and experimentation. We’ve already demoed at least two more GenAI features that help people get time back internally for early feedback, and we’re excited about the possibilities ahead.”

Get started

Learn more about how to bring the benefits of on-device AI with Gemini Nano to your apps.
