How to use FoldConstantExpr?

Hello,

I am trying to import a conv2d_transpose operation and generate code for it. When I import from TFLite I get a graph like this:

fn (%serving_default_input_1:0: Tensor[(1, 512, 256, 16), int8] /* ty=Tensor[(1, 512, 256, 16), int8] span=serving_default_input_1:0:0:0 */, %v_param_1: Tensor[(8, 8, 16, 32), int8] /* ty=Tensor[(8, 8, 16, 32), int8] span=sequential/conv2d/Conv2D:0:0 */, %v_param_2: Tensor[(32), int32] /* ty=Tensor[(32), int32] span=sequential/conv2d/BiasAdd/ReadVariableOp:0:0 */, %v_param_3: Tensor[(8, 8, 32, 64), int8] /* ty=Tensor[(8, 8, 32, 64), int8] span=sequential/conv2d_1/Conv2D:0:0 */, %v_param_4: Tensor[(64), int32] /* ty=Tensor[(64), int32] span=sequential/conv2d_1/BiasAdd/ReadVariableOp:0:0 */, %v_param_5: Tensor[(8, 8, 64, 128), int8] /* ty=Tensor[(8, 8, 64, 128), int8] span=sequential/conv2d_2/Conv2D:0:0 */, %v_param_6: Tensor[(128), int32] /* ty=Tensor[(128), int32] span=sequential/conv2d_2/BiasAdd/ReadVariableOp:0:0 */, %v_param_7: Tensor[(8, 8, 128, 256), int8] /* ty=Tensor[(8, 8, 128, 256), int8] span=sequential/conv2d_3/Conv2D:0:0 */, %v_param_8: Tensor[(256), int32] /* ty=Tensor[(256), int32] span=sequential/conv2d_3/BiasAdd/ReadVariableOp:0:0 */, %v_param_9: Tensor[(512, 8192), int8] /* ty=Tensor[(512, 8192), int8] span=sequential/dense/MatMul1:0:0 */, %v_param_10: Tensor[(512), int32] /* ty=Tensor[(512), int32] span=sequential/dense/BiasAdd/ReadVariableOp:0:0 */, %v_param_11: Tensor[(1, 256, 8, 8), int8] /* ty=Tensor[(1, 256, 8, 8), int8] span=sequential/conv2d_transpose/conv2d_transpose:0:0 */, %v_param_12: Tensor[(256), int8] /* ty=Tensor[(256), int8] span=sequential/conv2d_transpose/BiasAdd/ReadVariableOp:0:0 */, %v_param_13: Tensor[(256, 128, 8, 8), int8] /* ty=Tensor[(256, 128, 8, 8), int8] span=sequential/conv2d_transpose_1/conv2d_transpose:0:0 */, %v_param_14: Tensor[(128), int8] /* ty=Tensor[(128), int8] span=sequential/conv2d_transpose_1/BiasAdd/ReadVariableOp:0:0 */, %v_param_15: Tensor[(128, 64, 8, 8), int8] /* ty=Tensor[(128, 64, 8, 8), int8] span=sequential/conv2d_transpose_2/conv2d_transpose:0:0 */, %v_param_16: Tensor[(64), int8] /* 
ty=Tensor[(64), int8] span=sequential/conv2d_transpose_2/BiasAdd/ReadVariableOp:0:0 */, %v_param_17: Tensor[(64, 32, 8, 8), int8] /* ty=Tensor[(64, 32, 8, 8), int8] span=sequential/conv2d_transpose_3/conv2d_transpose:0:0 */, %v_param_18: Tensor[(32), int8] /* ty=Tensor[(32), int8] span=sequential/conv2d_transpose_3/BiasAdd/ReadVariableOp:0:0 */, %v_param_19: Tensor[(1, 1, 32, 1), int8] /* ty=Tensor[(1, 1, 32, 1), int8] span=sequential/conv2d_4/Conv2D:0:0 */, %v_param_20: Tensor[(1), int32] /* ty=Tensor[(1), int32] span=sequential/conv2d_4/BiasAdd/ReadVariableOp:0:0 */, output_tensor_names=["StatefulPartitionedCall_0"]) -> Tensor[(1, 512, 256), int8] {
span=sequential/reshape/Reshape:0:0 */;
  %62 = layout_transform(%v_param_11, src_layout="IOHW", dst_layout="HWIO") /* ty=Tensor[(8, 8, 1, 256), int8] */;
  %63 = qnn.conv2d_transpose(%61, %62, -128 /* ty=int32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, 0 /* ty=int32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, 1.33034f /* ty=float32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, 0.000264388f /* ty=float32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, channels=256, kernel_size=[8, 8], strides=[2, 2], padding=[3i64, 3i64, 3i64, 3i64], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 64, 32, 256), int32] span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */;
  %64 = qnn.requantize(%63, 0.000351727f /* ty=float32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, 0 /* ty=int32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, 0.183441f /* ty=float32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, 1 /* ty=int32 span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */, axis=3, out_dtype="int8") /* ty=Tensor[(1, 64, 32, 256), int8] span=sequential/conv2d_transpose/conv2d_transpose11:0:0 */;
  %65 = expand_dims(%v_param_12, axis=0, num_newaxis=3) /* ty=Tensor[(1, 1, 1, 256), int8] */;
  %66 = qnn.add(%64, %65, 0.183441f /* ty=float32 span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */, 1 /* ty=int32 span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */, 2.13872e-05f /* ty=float32 span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */, -93 /* ty=int32 span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */, 0.0908591f /* ty=float32 span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */, -128 /* ty=int32 span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */) /* ty=Tensor[(1, 64, 32, 256), int8] span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */;
  %67 = clip(%66, a_min=-128f, a_max=127f) /* ty=Tensor[(1, 64, 32, 256), int8] span=sequential/conv2d_transpose/Relu;sequential/conv2d_transpose/BiasAdd:0:0 */;
} 

I’ve truncated the graph to show only the interesting part. Now I need to fold away the layout_transform and expand_dims, as they interfere with my pattern matching. I think FoldConstantExpr is the transformation I need, but I can’t figure out how to describe the expressions involved. Could someone give me an example?

Hi @Necrotos,

It seems like the constants in your graph are currently represented as relay.Vars, which means constant folding won’t be able to recognise them and therefore won’t perform any optimization. There is a related discussion here that might help: Constant params should be constants - #3 by lhutton1

I think if you use the bind_params_by_name functionality (mentioned in the link above) and then run the relay.transform.FoldConstant pass on the module, it should solve your issue.

Hello @lhutton1,

thank you very much for this! It did indeed fix my issue. Do you know what causes the parameters to be defined as Vars? Is there something TVM expects me to do before/after importing from TFLite?