Does graph optimization not exist for GPU?

This might be a pretty naive question, but I was wondering why the graph optimization step (tune_graph with graph_tuner) exists in the x86 tutorial but is missing from the CUDA example.

Any help is appreciated!

You’re right. The graph tuner only supports CPU.

Thanks for the quick response! Does GPU not need graph-level optimization, or is TVM relying on cuDNN for this part?

No. TVM also generates CUDA code. The difference is that only x86 uses the NCHW[x]c layout, so it needs graph tuning to minimize the layout-transform overhead between ops with different NCHW[x]c layouts. On GPU, all conv2d ops use the NCHW layout, so there is no layout-transform overhead between ops and thus no need for the graph tuner.
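For context, the x86-only step looks roughly like this (a sketch adapted from the tune_relay_x86 tutorial; `input_name` and `target` come from the surrounding tutorial code, and `records` is the kernel-tuning log produced earlier in that tutorial):

```python
from tvm import relay
from tvm.autotvm.graph_tuner import DPTuner

# Pick, for every conv2d, the NCHW[x]c layout that minimizes total
# kernel time plus layout-transform time between neighboring ops.
def tune_graph(graph, dshape, records, opt_sch_file):
    target_op = [relay.op.get("nn.conv2d")]
    executor = DPTuner(graph, {input_name: dshape}, records, target_op, target)
    executor.benchmark_layout_transform(min_exec_num=2000)
    executor.run()
    executor.write_opt_sch2record_file(opt_sch_file)
```

There is no counterpart in the CUDA tutorial because, with a single NCHW layout, there is nothing for this step to choose between.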

Thanks for the explanation! Without the graph tuner, how does TVM do graph-level optimizations, such as operator fusion? Please bear with my naive questions :sweat_smile:

The graph tuner is different from graph optimization. Graph optimizations such as operator fusion and constant folding are mostly target-independent and are applied during compilation. In other words, as long as you compile a model with TVM, those graph optimizations are already involved.
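To make that concrete, here is a minimal sketch that applies two of those passes by hand (the toy function is made up for illustration; the pass names are from the public Relay API):

```python
import tvm
from tvm import relay

# A toy Relay function: two elementwise adds that FuseOps can merge.
x = relay.var("x", shape=(1, 8), dtype="float32")
y = relay.add(x, relay.const(1.0))
z = relay.add(y, relay.const(2.0))
mod = tvm.IRModule.from_expr(relay.Function([x], z))

# Two of the target-independent graph passes, applied by hand.
seq = tvm.transform.Sequential(
    [relay.transform.FoldConstant(), relay.transform.FuseOps(fuse_opt_level=2)]
)
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)
print(mod)  # the two adds now live in a single fused primitive function
```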

Ah I see. Could you provide me some quick pointers to the code where graph optimizations are implemented and used? I really appreciate it!

You can first go through this tutorial to get familiar with the code base: https://tvm.apache.org/docs/dev/codebase_walkthrough.html

Then you can trace Relay/TIR passes for such optimizations.
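If you want to see which passes actually run during a build, recent TVM versions let you hook the pass infrastructure; this is a sketch assuming `mod`, `target`, and `params` are already defined as in the tutorials:

```python
import tvm
from tvm import relay

# Record every pass executed during the build. Note that render() has
# to be called before the PassContext exits.
timing = tvm.ir.instrument.PassTimingInstrument()
with tvm.transform.PassContext(opt_level=3, instruments=[timing]):
    lib = relay.build(mod, target=target, params=params)
    profile = timing.render()
print(profile)  # per-pass timings, e.g. FoldConstant, FuseOps, ...
```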

Thanks! One last quick question: is it easy to locate which line of code in the tutorial applies the graph optimization?

```python
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build_module.build(mod, target=target, params=params)
```

The Relay build process eventually calls the graph optimization pipeline.
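If you want to inspect the result of that pipeline without going all the way to compiled code, `relay.optimize` runs the same passes and returns the optimized module (a sketch; `mod`, `target`, and `params` as in the tutorial):

```python
import tvm
from tvm import relay

# Run the same optimization pipeline as relay.build, but stop before
# codegen and return the optimized Relay IRModule.
with tvm.transform.PassContext(opt_level=3):
    opt_mod, opt_params = relay.optimize(mod, target=target, params=params)
print(opt_mod)  # fused, constant-folded Relay IR
```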

Thanks for the pointer!