I’m having some problems deploying this model on x86 using Relay.
When I set opt_level=3, performance gets almost 3x worse compared to opt_level=1. Any clue why this happens?
I’ve also tuned the convolutions on x86 using autotvm, but inference is still quite slow. Is there any other way I can speed up inference further?