How can we use tvm.relay.transform.ToMixedPrecision?

Hi all, I’m also trying this pass, and it seems the constant param data are not converted during compilation. Also, each conv2d op still outputs fp32 instead of fp16. As a result, I get a large number of casts converting the data (weight constants and conv2d outputs) in the Relay IR prior to relay.build. Is this expected behavior?

@ziyu-guo that is expected behavior. Right now the pass tries to preserve function signatures, so it cannot, for example, change the expected input params from fp32 → fp16. Instead, try binding the params as constants, like so: https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/main/fp16_pass/benchmark_fp16.py#L24 and then run the pass.

@AndrewZhaoLuo

I am trying to tune a cuda-fp16 model.

But I got some warnings, and I am not sure whether they would affect performance.

Any idea?

@ziyu-guo

How did you know this? How can I find the output type of each layer after the conversion?

  1. The error above should not affect performance, I believe. Some analysis fails, but the model compiles and runs anyway.

  2. To see the output type, you can run InferType() on the Relay IRModule and then print it out, e.g. https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/19284ddbd6bb28af61c0c2aa8bb334c5c53731a7/relay/test_layout_transform_pass.py#L17
