Last year I worked on a small arbitrary style transfer machine learning model that could run in real time on my Pixel 2. I trained it in TensorFlow and built the pipeline with MediaPipe.
Below is the same model, exported as a TensorFlow SavedModel and then converted to run in the browser.
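The post doesn't show the conversion step itself; for a TensorFlow SavedModel, the usual route to the browser is the `tensorflowjs_converter` CLI. A sketch, with placeholder paths rather than the ones actually used:

```shell
# Convert a SavedModel to a TF.js graph model (hypothetical paths).
# Requires: pip install tensorflowjs
tensorflowjs_converter \
  --input_format=tf_saved_model \
  --output_format=tfjs_graph_model \
  ./saved_model \
  ./web_model
```

The output directory can then be served statically and loaded in the page with `tf.loadGraphModel`.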
Note: please be aware that this will load five ML models, about 35 MB of data in total: a face detection model, a face landmark model, the VGG encoder, a second encoder, and the AdaIN decoder.
Note: this doesn't seem to work on iOS. It might work with the tfjs wasm backend, but that's for another time. Maybe.
The network architecture is based on Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization. Both the encoders and the decoder were optimised to run on my Pixel 2 Android phone. The encoder for the content and the encoder for the style are different networks.
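The AdaIN operation at the heart of that paper is simple: shift and scale the content features so their per-channel mean and standard deviation match those of the style features. A minimal NumPy sketch (not the actual model code, which runs inside the TensorFlow graph):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization.

    content, style: (H, W, C) feature maps from the encoder(s).
    Normalises the content features per channel, then rescales them
    to the style features' per-channel mean and std.
    """
    c_mean = content.mean(axis=(0, 1), keepdims=True)
    c_std = content.std(axis=(0, 1), keepdims=True)
    s_mean = style.mean(axis=(0, 1), keepdims=True)
    s_std = style.std(axis=(0, 1), keepdims=True)
    # eps guards against division by zero for flat channels
    return s_std * (content - c_mean) / (c_std + eps) + s_mean
```

The decoder is then trained to map these re-normalised features back to image space, which is what lets a single network handle arbitrary styles.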
The COCO 2014 training dataset was used as the content, and the wikiart dataset for the styles.
TFLite and memory on my Pixel were the two biggest bottlenecks in getting this to work with decent results at a high enough frame rate. Porting the models to run in Chrome was straightforward by comparison.