I have a newbie question, the one in the title, after reading the TVM paper.
In the paper, I see that TVM does both graph-level and operator-level optimizations. These methods are applicable to training as well. But some other important training topics are missing, such as operator placement and leveraging heterogeneous devices. I also searched for applications of TVM and found that most of them are for inference. In addition, the experiments section of the TVM paper concentrates on inference performance on server-side and embedded devices.
So I want to know whether TVM is meant only for inference by design, or whether it also targets training but is not widely used for it for some reason.