Hi all, I'm also trying this pass, and it seems the constant param data is not converted during compilation. Also, each conv2d op still outputs fp32 instead of fp16. As a result, I get a large number of casts that convert the data (weight constants and conv2d outputs) in the Relay IR prior to relay.build. Is this expected behavior?
@ziyu-guo that is expected behavior. Right now the pass tries to preserve function signatures, so it cannot, for example, change the expected input params from fp32 → fp16. Instead, try binding the params as constants, as done here: https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/main/fp16_pass/benchmark_fp16.py#L24 and then running the pass.
I am trying to tune a cuda-fp16 model, but I got some warnings and am not sure whether they would affect performance.
Any idea?
How did you know this? How can I find the output type of each layer after the conversion?
- The error above should not affect performance I believe, it just fails some analysis but it compiles and runs anyway.
- To see the output type you can run InferType() on the relay IRModule and then print it out. E.g. https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/19284ddbd6bb28af61c0c2aa8bb334c5c53731a7/relay/test_layout_transform_pass.py#L17
