Essentially there are two ways to do this. The “old” way is to export your TensorFlow neural network into a protobuf file, then load up the TensorFlow interpreter in your iOS/Android app, feed it the neural net, and run the inference directly on device. The GitHub repo [0] has a good set of examples of what that looks like in practice.
The new, still experimental way is to compile your neural net into executable code with their XLA / tfcompile tool, and link that into your app. They are adding more docs on this on the TensorFlow website [1].
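For the first approach, the export step might look roughly like the sketch below. It assumes the TF1-style graph API (reached here through `tensorflow.compat.v1`); the toy model, the node names `input`, `weights` and `output`, and the helper `export_frozen_graph` are all made up for illustration, not taken from the repo in [0].

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # TF1-style graph API (assumed available)

tf.disable_v2_behavior()


def export_frozen_graph(path):
    """Build a toy graph, freeze its variables, and write it as a protobuf."""
    with tf.Graph().as_default() as graph:
        # Hypothetical model: a single matmul with named input/output nodes.
        x = tf.placeholder(tf.float32, shape=[None, 4], name="input")
        w = tf.Variable(tf.ones([4, 2]), name="weights")
        tf.matmul(x, w, name="output")
        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())
            # Replace variables with constants so the file is self-contained.
            frozen = tf.graph_util.convert_variables_to_constants(
                sess, graph.as_graph_def(), ["output"])
    with open(path, "wb") as f:
        f.write(frozen.SerializeToString())
    return [node.name for node in frozen.node]


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "model.pb")
    print(export_frozen_graph(out))
```

The resulting `model.pb` is the file you would bundle with the app and hand to the on-device runtime; you refer to the `input` and `output` node names again when feeding data and reading results.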
They don't want you to go there. Remember who started the latest big AI projects (Google, Amazon, Microsoft, Facebook); I don't think they will stop grabbing data.
I think they'll develop a hivemind, where mobile devices add to the pool. In short, Skynet ;)