MediaPipe

This lab note is brief documentation of my experiments with TFLite and MediaPipe. I'll go over the following:

  • Overview of my ML training set-up
  • MediaPipe overview
  • Custom MediaPipe calculator

ML training set-up:

A couple of years ago, I heavily relied on AWS EC2 Spot instances (g2.2xlarge) with EBS storage for the training assets and model weights. Unfortunately, the cost of training added up quickly, and back then Google Colab wasn't quite a thing (nor were any of Google's TPU offerings).

After running continuous experiments in the cloud, I reached the point where building my own machine would be more cost-effective, so I got myself a GTX 1080 Ti and built my own ML rig.

I ssh into my machine, then use tmux to set up some persistent sessions. Within a tmux session, I launch my Jupyter server with jupyter notebook --no-browser --port=8888 --ip=0.0.0.0. The --ip is set to 0.0.0.0 so I can check on my notebooks while on the go.

An alternative to the --ip option above is ssh port forwarding: ssh -NL 8888:localhost:8888 username@64.233.160.0, which maps your local port 8888 to the server's port 8888. You can then access http://localhost:8888 from your host.

MediaPipe overview:

MediaPipe is a graph-based, cross-platform framework that facilitates building a pipeline for your ML apps. I was sold on the graph-based concept, and overall it seemed like an interesting tool to experiment with.

Note: You can use TFLite without MediaPipe, but there's more to building an ML-based mobile app than just adding a TFLite model.
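
If you haven't seen a MediaPipe graph before, the core idea is that a text-proto config declares streams and calculator nodes, and your application code just pushes packets in and observes what comes out the other end. Below is a trimmed-down sketch in the spirit of MediaPipe's desktop hello_world example, written from memory, so treat the exact headers and the Status spelling (absl::Status vs ::mediapipe::Status, depending on your release) as approximate.

#include "mediapipe/framework/calculator_graph.h"
#include "mediapipe/framework/port/logging.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"

absl::Status RunTinyGraph() {
  // A one-node graph: packets pushed into "in" flow through a
  // PassThroughCalculator and come out on "out".
  mediapipe::CalculatorGraphConfig config =
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
        input_stream: "in"
        output_stream: "out"
        node {
          calculator: "PassThroughCalculator"
          input_stream: "in"
          output_stream: "out"
        }
      )pb");

  mediapipe::CalculatorGraph graph;
  MP_RETURN_IF_ERROR(graph.Initialize(config));

  // Log every packet that reaches the "out" stream.
  MP_RETURN_IF_ERROR(graph.ObserveOutputStream(
      "out", [](const mediapipe::Packet& packet) {
        LOG(INFO) << packet.Get<std::string>();
        return absl::OkStatus();
      }));

  MP_RETURN_IF_ERROR(graph.StartRun({}));

  // Push a single string packet through the graph, then shut it down.
  MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
      "in", mediapipe::MakePacket<std::string>("hello mediapipe")
                .At(mediapipe::Timestamp(0))));
  MP_RETURN_IF_ERROR(graph.CloseInputStream("in"));
  return graph.WaitUntilDone();
}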

While the concept is great, MediaPipe has quite a steep learning curve. You need to understand Bazel, write C++, learn MediaPipe-specific concepts such as Calculators (sketched below), and so on.
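
To give a feel for what a Calculator looks like: it's a C++ class deriving from CalculatorBase, with a static GetContract() that declares its streams and a Process() that gets called once per packet. Here is roughly what a minimal pass-through calculator, similar to the stock PassThroughCalculator used in the graph sketch above, boils down to; the class name is mine and, as above, the Status spelling varies across releases.

#include "mediapipe/framework/calculator_framework.h"

namespace mediapipe {

// Minimal pass-through Calculator: forwards whatever packet arrives on its
// single input stream to its single output stream, untouched.
class PassThroughExampleCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Index(0).SetAny();                            // accept any packet type
    cc->Outputs().Index(0).SetSameAs(&cc->Inputs().Index(0));  // mirror the input type
    return absl::OkStatus();
  }

  absl::Status Process(CalculatorContext* cc) override {
    cc->Outputs().Index(0).AddPacket(cc->Inputs().Index(0).Value());
    return absl::OkStatus();
  }
};
REGISTER_CALCULATOR(PassThroughExampleCalculator);

}  // namespace mediapipe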

I spent quite some time trying to set up my project. I initially wanted to treat MediaPipe as a third-party dependency, but I quickly ran into problems when trying to write my own custom Calculators or extend their FrameProcessor.java file.

In the end, I set up my project with two git remotes: origin points to my GitHub repo, while upstream points to google/mediapipe, and I occasionally run git rebase upstream/master to rebase my repo atop MediaPipe.

Note: To reference MediaPipe from within your Bazel build files, see here.

The following one-liner compiles all your calculators, bundles your .tflite model, installs the final .apk onto your phone, launches the MainActivity, and then tails the app's logcat output.

bazel build -c opt --config=android_arm64 //mediapipe/examples/android/src/java/com/google/mediapipe/apps/basic:helloworld --sandbox_debug && adb install bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/basic/helloworld.apk && adb shell am start -n com.google.mediapipe.apps.basic/com.google.mediapipe.apps.basic.MainActivity && sleep 2 && adb logcat --pid=`adb shell pidof -s com.google.mediapipe.apps.basic`

Custom MediaPipe calculator:

I've also gone ahead and created a new calculator that converts std::vector<GlBuffer> into mediapipe::GpuBuffer. This is great for models that spit out an image. Along the way, I learnt about compute shaders, something that hasn't quite made its way to WebGL!
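
For the curious, here is roughly how that calculator is wired up, as a from-memory sketch rather than the actual code: the stream tags, class name, and hard-coded output size are mine, and the shader pass that actually reads the GlBuffers and writes pixels into the framebuffer is omitted entirely.

#include <vector>

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/gpu/gl_calculator_helper.h"
#include "mediapipe/gpu/gpu_buffer.h"
#include "tensorflow/lite/delegates/gpu/gl/gl_buffer.h"

namespace mediapipe {

// Sketch: turns the GlBuffer tensors produced by GPU inference into a
// GpuBuffer image that downstream rendering calculators understand.
class GlBuffersToGpuBufferCalculator : public CalculatorBase {
 public:
  static absl::Status GetContract(CalculatorContract* cc) {
    cc->Inputs().Tag("BUFFERS").Set<std::vector<tflite::gpu::gl::GlBuffer>>();
    cc->Outputs().Tag("IMAGE_GPU").Set<GpuBuffer>();
    return GlCalculatorHelper::UpdateContract(cc);
  }

  absl::Status Open(CalculatorContext* cc) override { return helper_.Open(cc); }

  absl::Status Process(CalculatorContext* cc) override {
    return helper_.RunInGlContext([this, cc]() -> absl::Status {
      const auto& buffers =
          cc->Inputs().Tag("BUFFERS").Get<std::vector<tflite::gpu::gl::GlBuffer>>();
      (void)buffers;  // consumed by the shader pass that is omitted below

      // Render into a texture that is backed by a GpuBuffer.
      GlTexture dst = helper_.CreateDestinationTexture(width_, height_);
      helper_.BindFramebuffer(dst);
      // ... bind the GlBuffers as SSBOs and run the shader that writes the
      // model output into the framebuffer -- omitted here.
      glFlush();

      auto output = dst.GetFrame<GpuBuffer>();
      dst.Release();
      cc->Outputs().Tag("IMAGE_GPU").Add(output.release(), cc->InputTimestamp());
      return absl::OkStatus();
    });
  }

 private:
  GlCalculatorHelper helper_;
  int width_ = 256;   // assumed model output size
  int height_ = 256;
};
REGISTER_CALCULATOR(GlBuffersToGpuBufferCalculator);

}  // namespace mediapipe

The nice part of this pattern is that everything stays on the GPU: the model's GlBuffer outputs are consumed by a shader that renders straight into a GpuBuffer-backed texture, so there's no readback to the CPU before the frame gets rendered.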