How can we use tvm.relay.transform.ToMixedPrecision?

ziyu-guo · February 18, 2022, 2:28am

Hi all, I’m also trying this pass, and it seems the constant param data are not converted during compilation. Each conv2d op also still outputs fp32 instead of fp16. As a result, I get a large number of casts in the Relay IR prior to relay.build, which convert the data (weight constants and conv2d outputs). Is this expected behavior?

AndrewZhaoLuo · February 23, 2022, 8:25pm

@ziyu-guo that is expected behavior. Right now the pass tries to preserve function signatures, so it cannot, for example, change the expected input params from fp32 → fp16. Instead, try binding the params as constants, like so: https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/main/fp16_pass/benchmark_fp16.py#L24, and then run the pass.

twmht · March 18, 2022, 8:53am

@AndrewZhaoLuo

I am trying to tune a cuda-fp16 model.

But I got some warnings, and I'm not sure whether they would affect performance.

Any idea?

twmht · March 18, 2022, 9:02am

@ziyu-guo

How did you find this out? How can I check the output type of each layer after the conversion?

AndrewZhaoLuo · March 21, 2022, 5:16pm

I believe the error above should not affect performance. It only means some analysis fails; the model compiles and runs anyway.
To see the output types, you can run InferType() on the Relay IRModule and then print it out. E.g. https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/19284ddbd6bb28af61c0c2aa8bb334c5c53731a7/relay/test_layout_transform_pass.py#L17