When autotuning an MXNet model with multiple outputs, using a batch_size greater than 1 and opt_level=3,
I encountered a performance drop in some of the outputs.
opt_level 0, 1, and 2 seem to be fine.
The performance of the TVM model with batch_size=2, opt_level=3:
Similarity Score for output 1 : 0.88
Similarity Score for output 2 : 0.55
Similarity Score for output 3 : 1.00
Similarity Score for output 4 : 0.86
Similarity Score for output 5 : 1.00
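
A per-output score like the ones above can be computed in several ways. The sketch below assumes cosine similarity between the flattened reference output (e.g. from MXNet) and the flattened TVM output; the metric actually used for these numbers is not stated, and the function name and sample values are hypothetical.

```python
import math

def cosine_similarity(ref, out):
    """Cosine similarity between two equal-length flat sequences of floats."""
    if len(ref) != len(out):
        raise ValueError("outputs must have the same number of elements")
    dot = sum(a * b for a, b in zip(ref, out))
    norm_ref = math.sqrt(sum(a * a for a in ref))
    norm_out = math.sqrt(sum(b * b for b in out))
    if norm_ref == 0.0 or norm_out == 0.0:
        # Undefined for an all-zero vector; report 0.0 instead of dividing by zero.
        return 0.0
    return dot / (norm_ref * norm_out)

# Hypothetical flattened outputs, one list per model output (not real model data).
reference_outputs = [[1.0, 2.0, 3.0], [0.5, -0.5, 1.0]]
tvm_outputs = [[1.0, 2.0, 3.0], [0.5, 0.5, 1.0]]

scores = [cosine_similarity(r, o) for r, o in zip(reference_outputs, tvm_outputs)]
for i, score in enumerate(scores, start=1):
    print(f"Similarity Score for output {i} : {score:.2f}")
```

Running the same comparison once per opt_level and once per batch_size would help narrow down which combination first diverges.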
Hmm, maybe a pass or compute/schedule that kicks in only when opt_level=3 is assuming that the batch size is one. If you have a repro, please open an issue and I can take a look.
To be clear, is this an accuracy or performance problem? Maybe both?