LegalizeOps: can we register ops legalization like in relay?

slai-nick · May 5, 2023, 9:48am

I still have a lot to learn about both unity and pre-unity. Yesterday I learnt how to map relax ops to PrimFuncs using LegalizeOps(customize_legalize_map: Optional[Dict[str, LegalizeFunc]]).

I would like to add legalization for my BYOC target in unity. The pipeline does not seem to pass anything to LegalizeOps, so I am guessing there might be a way to register it, like in relay.

Otherwise when doing BYOC in unity how do I make sure my customized legalization gets called?

sunggg · May 8, 2023, 4:09am

Hi, @slai-nick. Thank you for the question. LegalizeOps lowers each op to a PrimFunc implementation, which is the internal path, as opposed to BYOC, which is the external path.

To offload to your BYOC target, you can define the patterns of your target ops and run the RunCodegen pass, as in this test case: https://github.com/apache/tvm/blob/unity/tests/python/relax/test_codegen_tensorrt.py#L98

This tutorial also walks through how these passes work: [Unity][Tutorial] TVM Unity BYOC

slai-nick · May 9, 2023, 8:09am

Thank you for your answer.

I believe that I am mixing code generation concepts with lowering for my target…

I have been trying to find a way to map high-level relax operations straight to low-level TIR (without TE+scheduling) for my BYOC target (which is on a new device, by the way), and I thought LegalizeOps was the way to do it. Is there any way I can do it then?

Edit: Also, is registering relax operator strategy any different from relay?

Kevin-XiongC · May 9, 2023, 8:30am

I think in Relax, the lowering process becomes more programmable, like below. You can mix TIR and extern using TVMScript.

slai-nick · May 9, 2023, 10:33am

I see. Can this be confirmed?

After preparing the module for my codegen using FuseOpsByPattern(patterns) and MergeCompositeFunctions(), I can run RunCodegen(), and my BYOC codegen will receive the annotated relax function(s) as input. These functions contain a series of high-level relax operators, and I can lower them to TIR if necessary by implementing graph rewriting in the codegen, or possibly before calling RunCodegen().

sunggg · May 11, 2023, 7:26pm

If you are trying to prioritize your TIR over the one in LegalizeOps, you can write a pass that converts your target op to your TIR and apply it before the LegalizeOps pass. The implementation should be very similar to LegalizeOps; the only difference is the mapping between Relax ops and TIR. This way, we can keep our compilation pipeline composable.
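The ordering argument can be illustrated with a toy model in plain Python. This is not TVM's API: a "module" here is just a list of op names, and `make_lowering_pass`, `lower_custom`, `lower_default`, and the TIR function names are hypothetical stand-ins for a custom pass and for LegalizeOps. The sketch only shows why running the custom pass first lets its mapping win while the default pass still handles the remaining ops.

```python
from typing import Callable, Dict, List

Module = List[str]                 # toy stand-in for an IRModule: a list of op names
Pass = Callable[[Module], Module]  # toy stand-in for a module-to-module pass


def make_lowering_pass(op_to_tir: Dict[str, str]) -> Pass:
    """Build a pass that rewrites each mapped high-level op into a call_tir entry."""
    def run(mod: Module) -> Module:
        # Entries not found in the mapping (including ones that were already
        # lowered by an earlier pass) are passed through untouched.
        return [f"call_tir({op_to_tir[op]})" if op in op_to_tir else op for op in mod]
    return run


def sequential(passes: List[Pass]) -> Pass:
    """Compose passes so that they run left to right."""
    def run(mod: Module) -> Module:
        for p in passes:
            mod = p(mod)
        return mod
    return run


# Hypothetical mappings: the custom pass only knows the op we want to override.
lower_custom = make_lowering_pass({"relax.matmul": "my_packed_matmul"})
lower_default = make_lowering_pass(
    {"relax.matmul": "default_matmul", "relax.add": "default_add"}
)

pipeline = sequential([lower_custom, lower_default])
lowered = pipeline(["relax.matmul", "relax.add"])
print(lowered)
```

Swapping the two passes would let the default mapping claim relax.matmul first, which is why the custom pass goes in front.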

If you want to write down the IRModule directly, you can also do that like @Kevin-XiongC pointed out.

RunCodegen works at the Relax level. Internally, without going through TIR, it converts the Relax function directly to the external target’s equivalent and then compiles it, e.g., relax.conv2d → tensorrt.conv2d, followed by compilation. Is there any specific reason you want to go through TIR? We do have such a mechanism as well, but it uses a different path.
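As a rough picture of that Relax-level path, here is a toy model in plain Python, not TVM code. `EXTERNAL_OPS` and `codegen_offloaded` are hypothetical names, and the table holds only the conv2d pair from the example above. The point is that the translation is a direct op-to-op mapping with no TIR step in between.

```python
from typing import Dict, List

# Hypothetical lookup table; the single entry is the conv2d pair mentioned above.
EXTERNAL_OPS: Dict[str, str] = {"relax.conv2d": "tensorrt.conv2d"}


def codegen_offloaded(ops: List[str]) -> List[str]:
    """Translate the ops of an offloaded function one-to-one, with no TIR step."""
    missing = [op for op in ops if op not in EXTERNAL_OPS]
    if missing:
        # In this toy model, an op without an external equivalent cannot be offloaded.
        raise ValueError(f"no external equivalent for: {missing}")
    return [EXTERNAL_OPS[op] for op in ops]


external_program = codegen_offloaded(["relax.conv2d"])
print(external_program)
```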

slai-nick · May 12, 2023, 10:09am

Thank you for the detailed answer.

It would be for low-level optimisations. My plan currently is to set up a working template that remains high level, then move the low-level logic into TVM.

Long term, I believe using TE+scheduling will be the way. Short term, I am planning for the case where I have very specific needs (e.g. packing) that I don’t know how to express with TE+scheduling. So writing in TIR directly could be a temporary solution while I work through the TE+scheduling learning curve.

tqchen · May 12, 2023, 2:03pm

In case you are interested, we do express packing in TensorIR and scheduling. Check out the TensorIR tutorials, which may be helpful.