A Tensor Array usage question: Stacking a list of fixed sized tensors

Hi, I’m trying to translate PyTorch RNN models to Relay via the WIP PyTorch frontend. The example I’m working on is available below in the PyTorch repo:

The approach I tried is to translate the variable-length list append + stack above into Relay prelude list cons, reverse, and tensor array stack. The translated IR below looks reasonable except for the Tensor[(?, ?), float32], which I guess comes from using tensor array types. Because of this “Any type” tensor I get a type check error at %11 = nn.dense(%8, %10, units=None).

fn (%X: Tensor[(5, 2, 3), float32], %v29: Tensor[(16, 3), float32], %v32: Tensor[(16), float32], %v34: Tensor[(16, 4), float32], %v38: Tensor[(16), float32], %states: (Tensor[(?, ?), float32], Tensor[(2, 4), float32])) -> (tensor_float32_t[], (Tensor[(?, ?), float32], Tensor[(2, 4), float32])) {
  %0 = Nil;
  %34 = (
    let %while_loop: fn (int32, List[tensor_float32_t[]], (Tensor[(2, 4), float32], Tensor[(2, 4), float32])) -> (int32, List[tensor_float32_t[]], (Tensor[(2, 4), float32], Tensor[(2, 4), float32])) = fn (%i.1: int32, %outputs.6: List[tensor_float32_t[]], %state.6: (Tensor[(?, ?), float32], Tensor[(2, 4), float32])) -> (int32, List[tensor_float32_t[]], (Tensor[(2, 4), float32], Tensor[(2, 4), float32])) {
      %1 = less(%i.1, 5);
      if (%1) {
        %2 = add(%i.1, 1);
        %3 = take(%X, %i.1, axis=0);
        %4 = transpose(%v29, axes=[1, 0]);
        %5 = transpose(%4, axes=[1, 0]);
        %6 = nn.dense(%3, %5, units=None);
        %7 = add(%6, %v32);
        %8 = %state.6.0;
        %9 = transpose(%v34, axes=[1, 0]);
        %10 = transpose(%9, axes=[1, 0]);
        %11 = nn.dense(%8, %10, units=None) an internal invariant was violated while typechecking your program [19:31:38] /home/masa/projects/dev/tvm/src/tir/ir/expr.cc:184: Check failed: lanes > 1 (0 vs. 1) : 
; an internal invariant was violated while typechecking your program [19:31:38] /home/masa/projects/dev/tvm/src/tir/ir/expr.cc:184: Check failed: lanes > 1 (0 vs. 1) : 
; ;
        %12 = add(%7, %11);
        %13 = add(%12, %v38);
        %14 = strided_slice(%13, begin=[0, 12], end=[2, 16], strides=[1, 1]);
        %15 = sigmoid(%14);
        %16 = strided_slice(%13, begin=[0, 4], end=[2, 8], strides=[1, 1]);
        %17 = sigmoid(%16);
        %18 = %state.6.1;
        %19 = multiply(%17, %18);
        %20 = strided_slice(%13, begin=[0, 0], end=[2, 4], strides=[1, 1]);
        %21 = sigmoid(%20);
        %22 = strided_slice(%13, begin=[0, 8], end=[2, 12], strides=[1, 1]);
        %23 = tanh(%22);
        %24 = multiply(%21, %23);
        %25 = add(%19, %24);
        %26 = tanh(%25);
        %27 = multiply(%15, %26);
        %28 = (%27, %25);
        %29 = (%27, %28);
        %30 = %29.0;
        %31 = tensor2_float32(%30);
        %32 = Cons(%31, %outputs.6);
        %33 = %29.1;
        %while_loop(%2, %32, %33)
      } else {
        (%i.1, %outputs.6, %state.6)
      }
    };
    %while_loop
  );
  %35 = %34(0, %0, %states);
  %36 = %35.1;
  %37 = @rev(%36);
  %38 = @tensor_array_stack_float32(%37);
  %39 = %35.2;
  (%38, %39)
}
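
For reference, the frontend-side construction that produces the list handling above is roughly the sketch below. This is only a sketch of my translation, assuming the prelude variables visible in the IR ('tensor2', 'tensor_array_stack', Nil/Cons, @rev); the exact Prelude helper names may differ across TVM versions.

import tvm
from tvm import relay
from tvm.relay.prelude import Prelude

mod = tvm.IRModule()
p = Prelude(mod)

# prelude variables that show up in the IR above
tensor2 = p.get_var('tensor2', 'float32')              # wraps a rank-2 tensor into tensor_float32_t
ta_stack = p.get_var('tensor_array_stack', 'float32')  # stacks a List[tensor_float32_t]

outputs = p.nil()                                      # %0 = Nil
# inside the loop body, each per-step output gets wrapped and prepended:
#   outputs = p.cons(tensor2(step_output), outputs)    # %31 / %32
# after the loop, reverse the list and stack it:
#   stacked = ta_stack(p.rev(outputs))                 # %37 / %38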

What I need is a stack op for a variable-length list of fixed size tensors (similar to torch.stack). I have a feeling that tensor_array_stack is not the right one to use for my use case.

How should I go about this? Should I define a function similar to tensor_array_stack in the prelude, but for fixed size tensors?

cc @wweic

@masahi One approach is to convert your fixed size tensor into a dynamic tensor (use tensorN, where N is the rank of your tensor); then the type is list[tensor_t], which is exactly the type of tensor_array, so you can use tensor_array_stack. Afterwards you can use pattern matching to extract the tensor value, as sketched below. This requires less work, but in our experience fully dynamic tensor arrays hurt performance. So we are thinking of supporting lists of fixed size tensors exactly as you proposed.
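
For the first approach, the pattern match to extract the tensor looks roughly like this (a minimal sketch reusing the 'tensor2' prelude constructor from the IR above; the helper name unwrap_tensor2 is made up, and exact relay ADT spellings may differ by TVM version):

import tvm
from tvm import relay
from tvm.relay.prelude import Prelude

mod = tvm.IRModule()
p = Prelude(mod)

tensor2 = p.get_var('tensor2', 'float32')   # the tensor2_float32 constructor seen in the IR

# match on the constructor and return the Tensor stored inside the ADT value
t = relay.var('t')        # a value of type tensor_float32_t[]
data = relay.var('data')  # the wrapped Tensor[(?, ?), float32]
clause = relay.Clause(relay.PatternConstructor(tensor2, [relay.PatternVar(data)]), data)
unwrap_tensor2 = relay.Function([t], relay.Match(t, [clause], complete=False))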

I’m thinking of a Relay code generator that accepts the fixed shape, generates all the tensor array ops for lists of tensors of that shape, and registers the ops in the prelude, roughly along the lines of the skeleton below. We can start with tensor_array_stack. Does this make sense?
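
Every name here is hypothetical and the op bodies are left out; this is only meant to illustrate the shape-specialized name mangling and registration idea:

class StaticTensorArrayOps:
    """Hypothetical generator: register tensor array ops for List[Tensor[shape, dtype]]."""

    def __init__(self, prelude, dtype, shape):
        self.prelude = prelude
        self.dtype = dtype
        self.shape = shape

    def get_name(self, canonical):
        # e.g. ('tensor_array_stack', 'float32', (2, 4)) -> 'tensor_array_stack_float32_2_4'
        return "{}_{}_{}".format(canonical, self.dtype,
                                 "_".join(str(d) for d in self.shape))

    def register(self):
        # define the fixed-shape tensor_array_stack (and later the other tensor
        # array ops) as Relay functions and add them to self.prelude.mod under
        # the mangled names, so frontends can look them up by shape
        raise NotImplementedError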


Yes, the code generator approach makes sense. I think the PyTorch frontend can greatly benefit from a fixed size tensor list.

I am already using p.get_var('tensor2', "float32") and p.get_var('tensor_array_stack', "float32") in the implementation that produced the IR above, but I was stuck on a type check error due to the Any type and couldn’t figure out how to “unwrap” the tensor_t type to extract the original Tensor type. Thanks for suggesting pattern matching to extract tensors, I’ll try that. At this point performance is not important for me, I just want the first example working 🙂

@masahi For the unwrap functions, try the code in this PR (https://github.com/apache/incubator-tvm/pull/4325/files#diff-d06eed9196cdb2377458f6c3d327396eR141-R157). I closed it because I want to implement the fixed size tensor array, but you can use it to get your first POC working.
