PyTorch machine learning models on Android


Posted by Paul Ruiz – Senior Developer Relations Engineer

Earlier this year we released Google AI Edge, a suite of tools with easy access to ready-to-use ML tasks, frameworks that let you build ML pipelines, and support for running popular LLMs and custom models – all on-device. For AI on Android Spotlight Week, the Google team is highlighting various ways that Android developers can use machine learning to help improve their applications.

In this post, we'll dive into Google AI Edge Torch, which lets you convert PyTorch models to run locally on Android and other platforms, using the Google AI Edge LiteRT (formerly TensorFlow Lite) and MediaPipe Tasks libraries. For insights on other powerful tools, be sure to explore the rest of the AI on Android Spotlight Week content.

To make getting started with Google AI Edge easier, we've provided samples available on GitHub as an executable codelab. They demonstrate how to convert the MobileViT model for image classification (compatible with MediaPipe Tasks) and the DIS model for segmentation (compatible with LiteRT).

A red Android figurine shown next to a black-and-white silhouette of the same figure, labeled 'Original Image' and 'PT Mask' respectively, demonstrating image segmentation.

DIS model output

This blog guides you through how to use the MobileViT model with MediaPipe Tasks. Keep in mind that the LiteRT runtime provides similar capabilities, enabling you to build custom pipelines and features.

Convert the MobileViT model for image classification compatible with MediaPipe Tasks

Once you've installed the necessary dependencies and utilities for your app, the first step is to retrieve the PyTorch model you wish to convert, along with any other MobileViT components you might need (such as an image processor for testing).

from transformers import MobileViTImageProcessor, MobileViTForImageClassification

hf_model_path = "apple/mobilevit-small"
processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
pt_model = MobileViTForImageClassification.from_pretrained(hf_model_path)

Since the end result of this tutorial should work with MediaPipe Tasks, take an extra step to match the expected input and output shapes for image classification to what's used by the MediaPipe image classification Task.

import torch
from torch import nn

class HF2MP_ImageClassificationModelWrapper(nn.Module):

  def __init__(self, hf_image_classification_model, hf_processor):
    super().__init__()
    self.model = hf_image_classification_model
    if hf_processor.do_rescale:
      self.rescale_factor = hf_processor.rescale_factor
    else:
      self.rescale_factor = 1.0

  def forward(self, image: torch.Tensor):
    # BHWC -> BCHW.
    image = image.permute(0, 3, 1, 2)
    # RGB -> BGR.
    image = image.flip(dims=(1,))
    # Scale [0, 255] -> [0, 1].
    image = image * self.rescale_factor
    logits = self.model(pixel_values=image).logits  # [B, 1000] float32.
    # Softmax is required for the MediaPipe classification model.
    logits = torch.nn.functional.softmax(logits, dim=-1)

    return logits

hf_model_path = "apple/mobilevit-small"
hf_mobile_vit_processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
hf_mobile_vit_model = MobileViTForImageClassification.from_pretrained(hf_model_path)
wrapped_pt_model = HF2MP_ImageClassificationModelWrapper(
    hf_mobile_vit_model, hf_mobile_vit_processor).eval()
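
Before converting, it can help to sanity-check the wrapper. The following is a minimal sketch (the random input is purely illustrative) that confirms a BHWC image tensor produces a [1, 1000] probability distribution:

# Minimal sanity check with an illustrative random 256x256 RGB image in BHWC layout.
sample_image = torch.randint(0, 256, (1, 256, 256, 3)).float()
with torch.no_grad():
    probs = wrapped_pt_model(sample_image)

print(probs.shape)         # torch.Size([1, 1000])
print(float(probs.sum()))  # ~1.0, since softmax is applied in forward()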

Whether you plan to use the converted MobileViT model with MediaPipe Tasks or LiteRT, the next step is to convert the model to the .tflite format.

First, match the input shape. In this example, the input shape is 1, 256, 256, 3 for a 256×256-pixel three-channel RGB image.

Then, call AI Edge Torch's convert function to complete the conversion process.

import ai_edge_torch

sample_args = (torch.rand((1, 256, 256, 3)),)
edge_model = ai_edge_torch.convert(wrapped_pt_model, sample_args)
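
Converted models returned by ai_edge_torch can be called directly in Python, so a quick numerical comparison against the wrapped PyTorch model is a reasonable sanity check. A rough sketch (the tolerance is illustrative):

import numpy as np

# Compare the converted model against the PyTorch wrapper on the same sample input;
# small numerical differences are expected after conversion.
pt_output = wrapped_pt_model(*sample_args).detach().numpy()
edge_output = edge_model(*sample_args)
print(np.allclose(pt_output, edge_output, atol=1e-4))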

After converting the model, you can further refine it by incorporating metadata for the image classification labels. MediaPipe Tasks will use this metadata to display or return pertinent information after classification.

from mediapipe.tasks.python.metadata.metadata_writers import image_classifier
from mediapipe.tasks.python.metadata.metadata_writers import metadata_writer
from mediapipe.tasks.python.vision.image_classifier import ImageClassifier
from pathlib import Path

flatbuffer_file = Path('hf_mobile_vit_mp_image_classification_raw.tflite')
edge_model.export(flatbuffer_file)
tflite_model_buffer = flatbuffer_file.read_bytes()

# Extract the image classification labels from the HF model for later integration into the TFLite model.
labels = list(hf_mobile_vit_model.config.id2label.values())

writer = image_classifier.MetadataWriter.create(
    tflite_model_buffer,
    input_norm_mean=[0.0],  # Normalization is not needed for this model.
    input_norm_std=[1.0],
    labels=metadata_writer.Labels().add(labels),
)
tflite_model_buffer, _ = writer.populate()
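
The populated buffer now contains both the model and its label metadata. As a final step, you can write it back out so it can be bundled with an app; the file name below is illustrative:

# Write the metadata-enriched model to disk (file name is illustrative).
output_file = Path('hf_mobile_vit_mp_image_classification.tflite')
output_file.write_bytes(tflite_model_buffer)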

With all of that done, it's time to integrate your model into an Android app. If you're following along with the official Colab notebook, this involves saving the model locally. For an example of image classification with MediaPipe Tasks, explore the GitHub repository. You can find more information in the official Google AI Edge documentation.
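
If you'd like to verify the exported .tflite file before wiring it into an Android app, the MediaPipe Tasks Python API can load it directly. The sketch below assumes the illustrative file name from the previous step and a local test image:

import mediapipe as mp
from mediapipe.tasks import python as mp_python
from mediapipe.tasks.python import vision

# Load the converted model (path is illustrative) into a MediaPipe ImageClassifier.
base_options = mp_python.BaseOptions(
    model_asset_path='hf_mobile_vit_mp_image_classification.tflite')
options = vision.ImageClassifierOptions(base_options=base_options, max_results=3)
classifier = vision.ImageClassifier.create_from_options(options)

# Classify a local test image (file name is illustrative).
mp_image = mp.Image.create_from_file('test_image.jpg')
result = classifier.classify(mp_image)
for category in result.classifications[0].categories:
    print(category.category_name, category.score)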

Newly converted ViT model with MediaPipe Tasks (animated demo)

Once you understand how to convert a simple image classification model, you can use the same techniques to adapt various PyTorch models for Google AI Edge LiteRT or MediaPipe Tasks tooling on Android.

For further model optimization, consider techniques like quantizing during conversion. Check out the GitHub example to learn more about how to convert a PyTorch image segmentation model to LiteRT and quantize it.
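
As a rough illustration of what quantizing during conversion can look like, the sketch below follows the PT2E-based flow described in the ai_edge_torch documentation. Module paths and the graph-capture step vary between ai_edge_torch and PyTorch versions, so treat this as an outline rather than a drop-in recipe:

from ai_edge_torch.quantize import pt2e_quantizer, quant_config
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

# Outline of PT2E post-training quantization (exact APIs may differ by version).
quantizer = pt2e_quantizer.PT2EQuantizer().set_global(
    pt2e_quantizer.get_symmetric_quantization_config())

captured = capture_pre_autograd_graph(wrapped_pt_model, sample_args)
prepared = prepare_pt2e(captured, quantizer)
prepared(*sample_args)  # Calibrate with representative input(s).
quantized = convert_pt2e(prepared, fold_quantize=False)

quantized_edge_model = ai_edge_torch.convert(
    quantized, sample_args,
    quant_config=quant_config.QuantConfig(pt2e_quantizer=quantizer))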

What's Next

To stay up to date on Google AI Edge developments, look for announcements on the Google for Developers YouTube channel and blog.

We look forward to hearing how you're using these features in your projects. Use the #AndroidAI hashtag to share your feedback or what you've built on social media, and check out other content in AI on Android Spotlight Week!
