Question about what TVM does

philipturner · December 24, 2021, 10:15pm

Does TVM support training models on Apple devices, or only inference? There’s a lot of work out there about accelerating inference, but that’s not what I need. I’m looking to accelerate a resurrected Swift for TensorFlow (a framework for ML training, not inference) with Metal, and I’m not clear whether TVM is something I can use for that.

The reason I ask is that I saw OctoML running BERT on M1 extremely quickly (https://github.com/octoml/Apple-M1-BERT), but they didn’t clarify whether it was only for inference. I have tried researching whether TVM supports only inference before, but I never found a clear answer.

In addition, I need to be able to train models on iOS, so I can’t depend on any Python code when S4TF runs on iOS.

masahi · December 27, 2021, 7:39pm

For now, we only support inference. But the community is definitely interested in training support, and some people are already working on it. There were some related talks at TVMCon (recordings will be uploaded early next year).

philipturner · December 28, 2021, 1:34pm

@masahi I think my effort to create MetalXLA would be the perfect opportunity to experiment with using AutoTVM to accelerate training. It’s a real-time ML context where you have to balance compilation cost against code optimization. Also, you would either compete with or work with MPSGraph, giving a realistic scenario where other frameworks’ compilers might sometimes be better than TVM. Unlike CUDA XLA or PyTorch, which are relatively established, this backend is very open to change. I could even add features just to help out with TVM experimentation.

Also, the timing for such experimentation is perfect. There’s a gap of several months between now and when S4TF may be resurrected and I finish some collaboration with PyTorch on ops such as 3D convolutions. This gives ample time for you and others at TVM to debate whether it’s a good investment. I will also develop MetalSLC*, which will provide vital data for an AI algorithm concerned with predicting performance.

*Can’t provide a link because of this forum’s restriction on new users.

I read this research paper on using ML to predict the computational cost of models: [1811.11880] Predicting the Computational Cost of Deep Learning Models. That research focused only on NVIDIA GPUs. Several vendors besides NVIDIA (Intel, Imagination, Apple) have recently started making GPUs with good ML capabilities. Investing time in experimenting with a Metal project would help break the ML community out of NVIDIA’s walled garden.