@hogepodge I agree it doesn’t make sense given our current tutorials. But I think many of the tutorials need to be deleted and rewritten. For example, consider the “Compile Deep Learning Models” doctree:
- Compile PyTorch Models - import a model from PyTorch, build it, run it on CPU, then look up the classification output in a table.
- Compile TensorFlow Models - ingest a model with TensorFlow, use PIL to decode an image, import the TF model into TVM, then build and run on either CPU or GPU, then look up the classification output in a table.
- Compile MXNet Models - download resnet18, import into TVM, compile using CUDA, execute on GPU, then look up the classification output in a table. Finally explains how to use aux params.
- Compile ONNX Models - download an image-scaling model, import into TVM, compile, run on CPU, then format the output using PIL/pyplot.
- Compile Keras Models - download ResNet50, import into TVM, compile using CUDA, execute on GPU, then look up the classification output in a table.
- Compile TFLite Models - download mobilenet, import into TVM, load an image with PIL, compile for CPU, run inference, then look up the classification output in a table.
- …
80% of each of those tutorials is repeated. Meanwhile, they don’t even begin to explain how to get the dependencies installed properly. The TensorFlow tutorial references (without linking to) a non-existent page, docs/frontend/tensorflow.md, which ostensibly should have explained this. These tutorials are grouped under “compile deep learning models,” but most of their content demonstrates how to get to the end of model inference and show some result of running the model, rather than actually explaining the various caveats of compiling it.
The tutorials don’t really help past basic comprehension here. They could be much smaller and more targeted:
- just show how to install each framework’s dependencies so TVM can use it, and how to get a Relay model
- show how to compile a Relay model for various platforms
- show how to run a variety of different models (e.g. mobilenet, ResNet50, etc.). We could probably just show some of these on CPU so the tutorials can be run without needing a GPU.
If we do this split, then having the landing pages makes way more sense, because they sew together each of these smaller tutorials. I do still think it makes sense to have some end-to-end examples, and even the intro tutorial you put together could be a good one of these. I don’t think we can continue to have the cross-product of all possible models from all possible frameworks to all possible targets. I would like us to refactor the tutorials, then write higher-level documentation that explains the process (with pictures) in terms of the core compiler data structures (e.g. Relay IRModule, runtime.Module, etc.), and then provide a tutorial giving detailed steps for each logical step in the process. Doing this makes it easier to leap from a basic understanding of the tutorials to a basic understanding of the compiler’s structure.