I found that TVM supports HVX on Hexagon by using LLVM intrinsics. Does this mean LLVM cannot generate HVX instructions on its own? Another question: do the LLVM Hexagon intrinsics support the vrmpyz instruction, which can do four vrmpy operations in one packet? I cannot find any official documentation for the LLVM Hexagon intrinsics.
Hi, very good question! I had the same question several months ago, when I was trying to optimise Hexagon computation for int8 models in TVM. In the end I concluded that this is not a promising path on modern Qualcomm DSPs.
The first aspect. The “vrmpyz” instruction belongs to a private IP named Hexagon Tensor Accelerator (HTA). It is not part of the public ISA, and as a result the manufacturer can deprecate it at the hardware level without any notification. I’m not sure exactly which Hexagon DSPs support “vrmpyz”: v65 does not support it, v66 supports it, v67 I guess supports it, and for v68 and later I don’t know.
So if you want to use “vrmpyz”, you have to know the exact version of your target DSP and give up portability. You would also accelerate inference only on a limited set of older devices that support HTA.
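To illustrate the version dependence, here is a minimal sketch of gating the kernel choice on the DSP architecture. All names (`VRMPYZ_ARCHS`, `hexagon_version`, `pick_kernel`) are hypothetical, and the support set contains only v66 because that is the only version this thread reports as clearly supported; adjust it for your own hardware.

```python
import re

# Assumption taken from this thread only: v66 is the one version where
# vrmpyz is reported to work. This is not an official support matrix.
VRMPYZ_ARCHS = {66}


def hexagon_version(arch: str) -> int:
    """Parse strings such as 'v66', 'V68' or 'hexagonv67t' into an int."""
    m = re.fullmatch(r"(?:hexagon)?v(\d+)[a-z]*", arch.strip().lower())
    if m is None:
        raise ValueError(f"unrecognised Hexagon arch string: {arch!r}")
    return int(m.group(1))


def pick_kernel(arch: str) -> str:
    """Return 'vrmpyz' only for versions in VRMPYZ_ARCHS, else plain 'vrmpy'."""
    return "vrmpyz" if hexagon_version(arch) in VRMPYZ_ARCHS else "vrmpy"


if __name__ == "__main__":
    for arch in ("v65", "v66", "v68"):
        print(arch, "->", pick_kernel(arch))
```

The fallback to plain vrmpy keeps the same schedule usable on every other DSP version, at the cost of maintaining two code paths.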
The second aspect. There is no official documentation, so your effort will amount to reverse engineering. If something goes wrong, it will be quite hard to find the cause.
The third aspect. The latest Qualcomm DSPs have a more advanced accelerator technology called HMX, which is more powerful and more power efficient. It therefore looks more promising than the previous generation, HTA.
None of the public versions of LLVM I tried locally support “vrmpyz”. At least I did not succeed with them. They do not expose these instructions via the T.llvm_lookup_intrinsic_id("llvm.hexagon.V6.vrmpyzbb.rt") interface. You can find some related code in the LLVM sources, but I have no idea how it can be used.
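If you want to repeat this check against your own build, a small probe along these lines may help. It is a sketch under two assumptions: that your TVM build exposes `tvm.target.codegen.llvm_lookup_intrinsic_id`, and that an ID of 0 means LLVM does not know the name. The helper `has_llvm_intrinsic` is hypothetical, and it returns None when TVM or LLVM is not available.

```python
def has_llvm_intrinsic(name):
    """Return True/False if the LLVM linked into TVM knows `name`,
    or None if the check cannot be performed in this environment."""
    try:
        # Needs a TVM build with LLVM enabled.
        from tvm.target import codegen

        # Assumption: 0 is LLVM's "not an intrinsic" ID.
        return codegen.llvm_lookup_intrinsic_id(name) != 0
    except Exception:
        # TVM missing, built without LLVM, or the API differs.
        return None


if __name__ == "__main__":
    # Name copied from the post above.
    print(has_llvm_intrinsic("llvm.hexagon.V6.vrmpyzbb.rt"))
```

A result of False only tells you that this particular LLVM does not expose the name; it says nothing about what the hardware itself supports.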
Anyway, I guess Qualcomm engineers can answer your questions in more detail.
The vrmpyz instructions were a temporary addition to the HVX ISA. They only really exist in HVX v66. HVX v68 might still have them, but they are already considered deprecated in v68.