in this case:
free_var %input: Tensor[(1, 1, 112, 112), float32] /* ty=Tensor[(1, 1, 112, 112), float32] */;
%0 = subtract(%input, meta[relay.Constant][0] /* ty=Tensor[(1, 1, 1, 1), float32] */) /* ty=Tensor[(1, 1, 112, 112), float32] */;
%1 = subtract(%0, meta[relay.Constant][1] /* ty=Tensor[(1), float32] */) /* ty=Tensor[(1, 1, 112, 112), float32] */;
%2 = qnn.quantize(%1, 1f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="int8") /* ty=Tensor[(1, 1, 112, 112), int8] */;
%3 = clip(%2, a_min=-128f, a_max=127f) /* ty=Tensor[(1, 1, 112, 112), int8] */;
%4 = qnn.quantize(meta[relay.Constant][2] /* ty=Tensor[(32, 1, 3, 3), float32] */, 1f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="int8", axis=0) /* ty=Tensor[(32, 1, 3, 3), int8] */;
%5 = multiply(3.8147e-06f /* ty=float32 */, meta[relay.Constant][3] /* ty=Tensor[(32), float32] */) /* ty=Tensor[(32), float32] */;
%6 = qnn.conv2d(%3, %4, 0 /* ty=int32 */, 0 /* ty=int32 */, 1f /* ty=float32 */, %5, strides=[2, 2], padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3], out_dtype="int32");
%7 = qnn.quantize(meta[relay.Constant][4] /* ty=Tensor[(32), float32] */, 1f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="int32") /* ty=Tensor[(32), int32] */;
nn.bias_add(%6, %7) /* ty=Tensor[(1, 32, 56, 56), float32] */
quantize(1) + dequantize2 + conv2d + quantize(2) + bias:

The conv2d produces a tensor whose affine type has scale x_t.scale * w_t.scale. When the conv2d is per-channel, that scale in the affine type is itself a tensor rather than a scalar, so when the conv2d's next node is a quantize node (such as quantize(2)), we need to insert a requantize node, and the input scale of that requantize node is the scale from the conv2d's affine type.
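The per-channel requantize step can be sketched in NumPy under some assumptions (all names, shapes, and values below are hypothetical, chosen to mirror the 32-channel conv2d in the IR above): the int32 accumulator carries a per-channel scale x_scale * w_scale[c], and requantize re-expresses it in the single output scale demanded by the following quantize node.

```python
import numpy as np

# Hypothetical setup mirroring the IR: a qnn.conv2d accumulator whose
# affine type carries a per-channel scale x_scale * w_scale[c].
rng = np.random.default_rng(0)
channels = 32
acc = rng.integers(-2**20, 2**20, size=(1, channels, 56, 56), dtype=np.int64)

x_scale = 1.0                                 # input quantize scale (scalar)
w_scale = rng.uniform(0.001, 0.01, channels)  # per-channel weight scales
acc_scale = x_scale * w_scale                 # per-channel accumulator scale (a tensor)

out_scale = 0.05   # scalar scale demanded by the next quantize node
out_zp = 0         # its zero point

# requantize: real value = acc * acc_scale[c]; re-express it in out_scale
# units, then round, shift by the zero point, and clip into int8 range.
ratio = (acc_scale / out_scale).reshape(1, channels, 1, 1)
requantized = np.clip(np.rint(acc * ratio) + out_zp, -128, 127).astype(np.int8)
```

The key point is that `acc_scale` is a vector broadcast along the channel axis, which is exactly why a scalar-scale quantize node cannot consume the conv2d output directly and a requantize must be inserted between them.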