DiagnosticError with gradient calculation of convolutional layers

I’m trying to compute the gradients of some convolutional layers, following http://lernapparat.de/transformers-pytorch-tvm/. Here is my PyTorch code:

import torch
from torch import nn

class Test(nn.Module):

    def __init__(self):
        super(Test, self).__init__()
        self.score4 = nn.Conv2d(23, 22, 1)
        self.score_fr = nn.Conv2d(22, 21, 1)

    def forward(self, inp):
        h = inp
        h = self.score4(h)
        h = self.score_fr(h)
        return h
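For reference, the bias gradients this model should produce can be checked directly with PyTorch autograd (a minimal sketch; the input shape (1, 23, 13, 19) is taken from the Relay dump below, and summing the output is just a convenient way to get a scalar loss):

```python
import torch
from torch import nn

class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.score4 = nn.Conv2d(23, 22, 1)
        self.score_fr = nn.Conv2d(22, 21, 1)

    def forward(self, inp):
        return self.score_fr(self.score4(inp))

model = Test()
out = model(torch.randn(1, 23, 13, 19))
out.sum().backward()

# Both bias gradients are 1-D vectors matching their bias shapes.
print(tuple(model.score4.bias.grad.shape))    # (22,)
print(tuple(model.score_fr.bias.grad.shape))  # (21,)
```

So in a correct backward function, both bias gradients should come out 1-D.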

The next step is to import it into TVM and transform the corresponding function into its gradient with

tvm.relay.transform.gradient()

After separating the forward and backward computations, I hit the following error when I tried to compile the backward function:

In `main`: 
#[version = "0.0.5"]
fn (%score_fr.weight: Tensor[(21, 22, 1, 1), float32], %score_fr.bias: Tensor[(21), float32], %gr:out: Tensor[(1, 21, 13, 19), float32], %score4.bias: Tensor[(22), float32], %score4.weight: Tensor[(22, 23, 1, 1), float32], %input0: Tensor[(1, 23, 13, 19), float32], %input:captures:0: Tensor[(1, 21, 13, 19), float32], %input:captures:1: Tensor[(1, 21, 13, 19), float32], %input:captures:2: Tensor[(1, 22, 13, 19), float32], %input:captures:3: Tensor[(1, 22, 13, 19), float32]) -> (Tensor[(1, 23, 13, 19), float32], Tensor[(22, 23, 1, 1), float32], Tensor[(22), float32], Tensor[(21, 22, 1, 1), float32], Tensor[(21), float32], Tensor[(1, 21, 13, 19), float32]) {
  %0 = zeros_like(%input0);
  %1 = zeros_like(%input:captures:3);
  %2 = zeros_like(%input:captures:2);
  %3 = zeros_like(%input:captures:1);
  %4 = zeros_like(%input:captures:0);
  %5 = multiply(%input:captures:0, %gr:out);
  %6 = zeros_like(%5);
  %7 = sum(%5);
  %8 = ones_like(%7);
  %9 = expand_dims(%8, axis=0);
  %10 = expand_dims(%9, axis=1);
  %11 = expand_dims(%10, axis=2);
  %12 = expand_dims(%11, axis=3);
  %13 = broadcast_to_like(%12, %5);
  %14 = add(%6, %13);
  %15 = multiply(%14, %gr:out);
  %16 = collapse_sum_like(%15, %input:captures:0);
  %17 = add(%4, %16);
  %18 = collapse_sum_like(%17, %input:captures:1);
  %19 = add(%3, %18);
  %20 = nn.conv2d_transpose(%19, %score_fr.weight, padding=[0, 0, 0, 0]);
  %21 = add(%2, %20);
  %22 = collapse_sum_like(%21, %input:captures:3);
  %23 = add(%1, %22);
  %24 = nn.conv2d_transpose(%23, %score4.weight, padding=[0, 0, 0, 0]);
  %25 = add(%0, %24);
  %26 = zeros_like(%score4.weight);
  %27 = reshape(%input0, newshape=[1, -1, 0, 0]);
  %28 = tile(%23, reps=[1, 23, 1, 1]);
  %29 = reshape(%28, newshape=[-1, 1, 0, 0]);
  %30 = nn.conv2d(%27, %29, padding=[0, 0, 0, 0], groups=23);
  %31 = reshape(%30, newshape=[1, 23, 22, 1, 1]);
  %32 = sum(%31, axis=[0]);
  %33 = transpose(%32, axes=[1, 0, 2, 3]);
  %34 = add(%26, %33);
  %35 = zeros_like(%score4.bias);
  %36 = expand_dims(%35, axis=0, num_newaxis=3);
  %37 = layout_transform(%36, src_layout="CHWN", dst_layout="NCHW");
  %38 = sum(%21, axis=[0, 2, 3], exclude=True);
  %39 = add(%37, %38) ***tensor type `Tensor[(22, 1, 13, 19), float32]` has 4 dimensions, while `Tensor[(22), float32]` has 1 dimensions; unable to unify: `Tensor[(22, 1, 13, 19), float32]` and `Tensor[(22), float32]`; ;***
  %40 = zeros_like(%score_fr.weight);
  %41 = reshape(%input:captures:2, newshape=[1, -1, 0, 0]);
  %42 = tile(%19, reps=[1, 22, 1, 1]);
  %43 = reshape(%42, newshape=[-1, 1, 0, 0]);
  %44 = nn.conv2d(%41, %43, padding=[0, 0, 0, 0], groups=22);
  %45 = reshape(%44, newshape=[1, 22, 21, 1, 1]);
  %46 = sum(%45, axis=[0]);
  %47 = transpose(%46, axes=[1, 0, 2, 3]);
  %48 = add(%40, %47);
  %49 = zeros_like(%score_fr.bias);
  %50 = sum(%17, axis=[1], exclude=True);
  %51 = add(%49, %50);
  %52 = zeros_like(%gr:out);
  %53 = multiply(%14, %input:captures:0);
  %54 = collapse_sum_like(%53, %gr:out);
  %55 = add(%52, %54);
  (%25, %34, %39, %48, %51, %55)
}

There are two convolutional-layer bias gradients (%39 and %51) in the module. I think they should be computed in exactly the same way, but as the dump above shows, they are not: the latter looks fine, while the former fails with tensor type Tensor[(22, 1, 13, 19), float32] has 4 dimensions, while Tensor[(22), float32] has 1 dimension; unable to unify: Tensor[(22, 1, 13, 19), float32] and Tensor[(22), float32].
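The shape mismatch can be reproduced with plain NumPy (a sketch of what the two branches of the IR effectively compute; all shapes are taken from the dump above):

```python
import numpy as np

# Gradient flowing into score4's output, shape (1, 22, 13, 19) like %21.
dout = np.random.randn(1, 22, 13, 19).astype("float32")
bias = np.zeros(22, dtype="float32")

# Correct bias gradient (what %50 does for score_fr): reduce over the
# N, H, W axes, leaving a 1-D vector of shape (22,).
good = dout.sum(axis=(0, 2, 3))
assert good.shape == bias.shape

# What the failing branch builds instead: zeros_like(bias) expanded to
# (1, 1, 1, 22) and layout-transformed CHWN -> NCHW, i.e. (22, 1, 1, 1).
zeros4d = np.zeros(22, dtype="float32").reshape(1, 1, 1, 22).transpose(3, 0, 1, 2)

# sum(axis=[0, 2, 3], exclude=True) reduces only axis 1, giving (1, 13, 19);
# adding it to the (22, 1, 1, 1) zeros broadcasts to (22, 1, 13, 19),
# which cannot unify with the expected (22,) bias gradient.
bad = zeros4d + dout.sum(axis=1)
print(bad.shape)  # (22, 1, 13, 19)
```

So the %39 branch produces a 4-D tensor where a 1-D bias gradient is expected, which is exactly the unification error the compiler reports.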

I ran my code with both TVM 0.7 and 0.8dev, on both Windows and Ubuntu, and the results were all the same. I just don’t know why this happens.

Hi- do you mind sharing a full reproducible script?

I’m sorry that I only just noticed your reply. I’ve put a full script on GitHub: https://github.com/Iktsuarpokk/tvm_grad. Sorry again.