Relax.permute_dims not supported?

Hi,

I'm trying to build a Relax module with the script below:

with tvm.transform.PassContext(config={"relax.backend.use_cuda_graph": False}):
    ex = relax.build(mod, "cuda")

Then it reports this error:

CodeGenVM cannot handle this intrinsic now Op(relax.permute_dims)

Does this mean that Relax currently doesn't support permute_dims?

The Relax IR is:

@I.ir_module
class Module:
    I.module_attrs({"external_mods": [metadata["runtime.Module"][0]]})
    @R.function
    def main(x1: R.Tensor((1, 11, 200, 200), dtype="float32")) -> R.Tensor((1, 45, 100, 100), dtype="float32"):

        R.func_attr({"num_input": 1, "relax.force_pure": 1, "tir_var_upper_bound": {"a": 400, "b": 500}})
        with R.dataflow():
            lv: R.Tensor((1, 200, 200, 11), dtype="float32") = R.permute_dims(x1, axes=[0, 2, 3, 1])
            lv1: R.Tensor((45, 3, 3, 11), dtype="float32") = R.permute_dims(metadata["relax.expr.Constant"][0], axes=[0, 2, 3, 1])
            lv1_1: R.Tensor((1, 45, 1, 1), dtype="float32") = R.reshape(metadata["relax.expr.Constant"][1], R.shape([1, 45, 1, 1]))
            lv2: R.Tensor((1, 1, 1, 45), dtype="float32") = R.permute_dims(lv1_1, axes=[0, 2, 3, 1])
            lv_1 = R.call_dps_packed("fused_relax_nn_conv2d_relax_add_cutlass", (lv, lv1, lv2), out_sinfo=R.Tensor((1, 100, 100, 45), dtype="float32"))
            lv3: R.Tensor((1, 45, 100, 100), dtype="float32") = R.permute_dims(lv_1, axes=[0, 3, 1, 2])
            lv3_1: R.Shape([1, 45, 100, 100]) = R.shape_of(lv3)
            lv4: R.Tensor(lv3_1, dtype="float32") = R.broadcast_to(metadata["relax.expr.Constant"][2], lv3_1)
            lv4_1: R.Tensor((1, 100, 100, 45), dtype="float32") = R.nn.prelu(lv_1, lv4, axis=3)
            gv: R.Tensor((1, 45, 100, 100), dtype="float32") = R.permute_dims(lv4_1, axes=[0, 3, 1, 2])
            R.output(gv)
        return gv
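For context on what these ops compute (a minimal pure-Python sketch, not TVM code): R.permute_dims(x, axes=...) is a generalized transpose where output axis i takes input axis axes[i]. With axes=[0, 2, 3, 1], an NCHW tensor becomes NHWC, which is why these transposes bracket the NHWC CUTLASS conv2d call above.

```python
def permute_shape(shape, axes):
    """Shape of the permuted tensor: out_shape[i] = shape[axes[i]]."""
    return tuple(shape[a] for a in axes)

def permute_index(out_index, axes):
    """Map an output index back to the input index it reads from."""
    in_index = [0] * len(axes)
    for i, a in enumerate(axes):
        in_index[a] = out_index[i]
    return tuple(in_index)

# Matches lv in the IR: (1, 11, 200, 200) NCHW -> (1, 200, 200, 11) NHWC.
print(permute_shape((1, 11, 200, 200), [0, 2, 3, 1]))  # (1, 200, 200, 11)

# Output element [n=0, h=5, w=7, c=3] reads input element [n=0, c=3, h=5, w=7].
print(permute_index((0, 5, 7, 3), [0, 2, 3, 1]))       # (0, 3, 5, 7)
```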

LegalizeOps is required before calling build:

mod = relax.transform.LegalizeOps()(mod)

Thanks, after adding LegalizeOps it seems to work now. Another fix I borrowed from test_codegen_cutlass.py is to apply relax.pipeline.get_pipeline. What is this transform for?

And regarding relax.transform.LegalizeOps()(mod), I'd like to understand how it fixes the original error.
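Roughly (a conceptual sketch, not TVM's actual implementation): before legalization, the module still contains the abstract high-level op relax.permute_dims, which the VM code generator has no lowering for — hence the "CodeGenVM cannot handle this intrinsic" error. LegalizeOps rewrites each such op into a call into a generated TIR PrimFunc, which the VM can compile. The generated kernel is morally an explicit loop nest like this one:

```python
def transpose_kernel(x):
    """Explicit loop nest standing in for the TIR that legalization
    would generate for R.permute_dims(x, axes=[1, 0]) on a 2-D tensor."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * rows for _ in range(cols)]
    for i in range(rows):
        for j in range(cols):
            out[j][i] = x[i][j]
    return out

print(transpose_kernel([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
```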

Does it make sense to add LegalizeOps to the list of passes run in relax.build? It looks like we've had multiple questions stemming from the same confusion.

I think it's because relay.build used to take care of running all the required passes, whereas in Relax only the lowering passes are run as part of relax.build; the others have to be called explicitly.

Thanks for bringing this up. For now, we'd like to keep build a minimal function, which makes the compilation path more customisable.

As for usability, we are writing the Unity docs, which should resolve most of the confusion. Meanwhile, I don't think it's necessary to keep the same behavior as Relay's.

cc @tqchen

Yes the customizability of the compilation flow makes sense, and yeah we don’t need to keep the same behavior as Relay, but the fact that there’s a difference might be the source of confusion.

One option I thought of was to let relax.build take a pipeline as an argument (perhaps with zero_pipeline as the default) to be applied at compile time; those who have their own pass pipeline can pass that instead. This way, at least op legalization and fusion would run by default if no pipeline is passed, and the flow could always be customized by passing a proper pipeline.
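The idea above could be sketched like this (a hypothetical illustration of the proposal — the names legalize, fuse, and the build signature are stand-ins, not TVM's API):

```python
# Stand-ins for relax.transform.LegalizeOps(), FuseOps(), etc.
# A "module" is modeled as a list of strings for illustration only.
def legalize(mod):
    return mod + ["legalized"]

def fuse(mod):
    return mod + ["fused"]

def default_pipeline(mod):
    """Default pass sequence applied when the caller supplies none."""
    for p in (legalize, fuse):
        mod = p(mod)
    return mod

def build(mod, target, pipeline=default_pipeline):
    """Hypothetical build: runs the pipeline, then 'codegen' (a stand-in)."""
    mod = pipeline(mod)
    return (target, mod)

# Default flow: legalization and fusion run automatically.
print(build(["main"], "cuda"))  # ('cuda', ['main', 'legalized', 'fused'])

# Custom flow: callers with their own pipeline override the default.
print(build(["main"], "cuda", pipeline=legalize))  # ('cuda', ['main', 'legalized'])
```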

Somehow I’ve always felt that a good design is one where a user can understand the flow without having to go through documentation (it’s okay when needed for customization, but perhaps the default flow can be made obvious, if that makes sense).