Hi TVM team,
I am writing a custom operator in TVM and adding a tensorize schedule for it. I see that we can use LLVM intrinsics (for example, llvm.aarch64.neon.uaddlp and llvm.aarch64.neon.addp) through tvm.tir.call_llvm_pure_intrin(). Is it possible to emit a plain LLVM IR instruction directly instead of an intrinsic (for example, %3 = mul <8 x i16> %1, %0 for vmulq_u16)?
I look forward to your reply.
Thanks