Is there a way to attach the TensorIR implementation to “relay.Call” in a more direct way?
“relay.call_lowered()” (https://github.com/apache/tvm/pull/9312/) seems to be an option, but how to link it to the relay.Call is currently unclear. If this is the way to go, could you provide a code snippet?
What you want is exactly what we are implementing in Relax (Relay Next)! One key design decision we made in Relax is to let the high-level IR directly interact with and call into lower-level TensorIR and PackedFunc. To achieve this, we introduced two intrinsics to bridge the gap.
TensorIR functions and many external libraries adopt a destination-passing convention (we need to explicitly allocate the output and pass it in as an argument to the function), so we introduce call_tir, an intrinsic that allows users to call a TIR function or a packed function in an immutable way. The second intrinsic we introduced is call_packed, which indicates a call to a packed function.
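To make the destination-passing convention concrete, here is a minimal plain-Python model of the idea. It is not the Relax API: matmul_dps and call_dps are hypothetical names invented for illustration, and nested lists stand in for tensors. The sketch only assumes what is described above, namely that the callee writes into a caller-provided output, and that a call_tir-like wrapper allocates that output so the caller sees an ordinary value-returning call.

```python
def matmul_dps(a, b, out):
    """Destination-passing style: write a @ b into the caller-provided `out`."""
    n, k, m = len(a), len(b), len(b[0])
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc  # the result is written in place, nothing is returned


def call_dps(func, args, out_shape):
    """Hypothetical wrapper modelling the idea behind call_tir (not the real API).

    It allocates a fresh output, invokes the destination-passing function,
    and returns the output, so the inputs are never mutated from the
    caller's point of view.
    """
    rows, cols = out_shape
    out = [[0.0] * cols for _ in range(rows)]
    func(*args, out)
    return out


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[5.0, 6.0], [7.0, 8.0]]
y = call_dps(matmul_dps, (x, w), (2, 2))
```

Here y holds the product while x and w are left untouched, which is the sense in which such a call behaves immutably at the graph level.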
In the code snippet below, we implement a TensorIR PrimFunc in TVMScript and call it directly inside the Relax function with call_tir. For more context and implementation details, please refer to our design doc.
We will send out a forum discussion post to introduce Relax soon. It will talk about our high-level goals, highlights, development plans, and so on. Please feel free to check out our presentation at TVMCon.
As far as I know, until Relax officially lands we have to implement a relay_to_tir plugin to “bring your own lowering”. This could be just a relay pass that rewrites the relay calls we are interested in into call_lowered.
Basically I think we need:
Provide your own relay op strategy for TensorIR. Currently the relay strategy mechanism only covers TE compute and schedule.
Interact with your own op strategy in the relay_to_tir pass: get a PrimFunc for each relay call you want to lower via the strategy, then rewrite the original expr into a call_lowered of the PrimFunc.
Also, if you want to lower fused functions, you have to determine how to compose the PrimFuncs of the sub-ops into a single PrimFunc. That could involve a lot of work, but generally we can follow how TECompiler chains the TE expressions of fused sub-ops.
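The rewrite step described above can be sketched as a toy model in plain Python. This is not TVM code: Var, Call, CallLowered, lower_calls and the strategy dictionary are all hypothetical stand-ins for relay expressions, call_lowered and an op strategy, and PrimFuncs are represented only by their names. It shows just the shape of the pass: visit the expression bottom-up and replace each call whose op has a registered PrimFunc.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    """Stand-in for a high-level operator call (hypothetical, not relay.Call)."""
    op: str
    args: Tuple["Expr", ...]


@dataclass(frozen=True)
class CallLowered:
    """Stand-in for a call that targets an already-lowered PrimFunc."""
    prim_func: str
    args: Tuple["Expr", ...]


Expr = Union[Var, Call, CallLowered]


def lower_calls(expr: Expr, strategy: Dict[str, str]) -> Expr:
    """Rewrite every Call whose op appears in `strategy` into a CallLowered."""
    if isinstance(expr, Var):
        return expr
    # Rewrite the arguments first so nested calls are handled bottom-up.
    new_args = tuple(lower_calls(arg, strategy) for arg in expr.args)
    if isinstance(expr, CallLowered):
        return CallLowered(expr.prim_func, new_args)
    if expr.op in strategy:
        return CallLowered(strategy[expr.op], new_args)
    return Call(expr.op, new_args)  # ops without a PrimFunc stay untouched


# Only nn.dense has a hand-written PrimFunc in this toy strategy.
strategy = {"nn.dense": "my_dense_primfunc"}
expr = Call("nn.relu", (Call("nn.dense", (Var("x"), Var("w"))),))
lowered = lower_calls(expr, strategy)
```

In this model the nn.relu call is kept as-is while the inner nn.dense call becomes a CallLowered. Composing the PrimFuncs of fused sub-ops into one PrimFunc is deliberately left out of the sketch.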
Thanks everybody for the responses! Helped a lot. There is one detail that is still open in my understanding:
I’d like to import the NN from an ONNX file, then use the Relax operator that calls the TensorIR PrimFunc, as proposed in the previous responses.
How can I use an onnx or TF file as an input here? Is there a way using the current frontend flow? Or do I have to describe the entire NN in relax?
Hi @MJKlaiber, we are working on a relay to relax translator: https://github.com/tlc-pack/relax/pull/63, and will potentially put up a ResNet demo next week, so that you can import your model to relay first, translate it to relax, and call your hand-written TensorIR PrimFunc in Relax. I will keep you updated about the progress.
@MJKlaiber, thank you and Dennis for joining today’s Relax open dev meeting and showing great interest in Relax! The translator is implemented here: https://github.com/tlc-pack/relax/pull/75, and we will merge it later this week or early next week. Will keep you posted!