A PyTorch and TVM version incompatibility problem while using from_dlpack and to_dlpack

I ran into a version incompatibility problem when using from_dlpack and to_dlpack to speed up data transfer by avoiding a memory copy. Hope this helps someone else who hits it.

I used PyTorch 1.13.1 and TVM 0.11.1, and I wanted to use DLPack.

My code:

input_tvm = tvm.nd.from_dlpack(torch.utils.dlpack.to_dlpack(example_input_tensor))
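
For reference, here is a self-contained version of that snippet. The input shape (1, 16, 32, 32) and dtype are my assumptions for illustration; they match the compact strides asserted in the error below.

import torch
import tvm

# Hypothetical input; shape chosen to match the expected strides (16384, 1024, 32, 1).
example_input_tensor = torch.randn(1, 16, 32, 32, dtype=torch.float32)

# Zero-copy hand-off: export the torch tensor as a DLPack capsule and wrap it
# as a TVM NDArray that shares the same memory.
input_tvm = tvm.nd.from_dlpack(torch.utils.dlpack.to_dlpack(example_input_tensor))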

Then I got the error message:

Check failed: ret == 0 (-1 vs. 0) : 
Assert fail: ((((1 == int32(arg.input.strides[3])) 
&& (32 == int32(arg.input.strides[2]))) 
&& (1024 == int32(arg.input.strides[1]))) 
&& (16384 == int32(arg.input.strides[0]))), 
arg.input.strides: expected to be compact array

I checked the strides of the torch tensor like this:

assert example_input_tensor.stride() == torch.utils.dlpack.from_dlpack(input_tvm.to_dlpack()).stride()

They are not equal. In another, two-dimensional case with input shape (1, 32), the stride values are (32, 1) and (1, 1), respectively. That is why we get the error shown above.

When I changed the input shape to (2, 32), both strides were (32, 1).
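
Here is a small sketch of that comparison (dtype and values are arbitrary; only the shapes matter):

import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

for shape in [(1, 32), (2, 32)]:
    a = torch.zeros(shape, dtype=torch.float32)
    b = from_dlpack(to_dlpack(a))  # round trip through DLPack
    print(shape, a.stride(), b.stride())

# On PyTorch 1.13.1 this prints (32, 1) vs (1, 1) for shape (1, 32),
# and (32, 1) for both tensors for shape (2, 32).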

I downgraded PyTorch to 1.12.0 and the problem went away.

I’m having the same problem. I want to pass PT tensors to TVM efficiently, but strides are somehow corrupted if I use tvm.runtime.ndarray.from_dlpack(to_dlpack(tensor)).

The error I get from TVM:

  File "/Users/masa/projects/dev/tvm/src/runtime/library_module.cc", line 87
TVMError: Assert fail: T.int64(1) == arg_p_inp_0_strides[1] and T.int64(77) == arg_p_inp_0_strides[0], arg.p_inp_0.strides: expected to be compact array

Maybe this is a PT problem. Here is a weird demonstration:

# from torch.utils.dlpack import to_dlpack, from_dlpack
In [52]: a = torch.randint(0, 100, (1, 77), dtype=torch.int32)

In [53]: b = from_dlpack(to_dlpack(a))

In [54]: a.stride()
Out[54]: (77, 1)

In [55]: b.stride()
Out[55]: (1, 1)

cc @tqchen

The weird thing is, I’ve used exactly the same approach, tvm.runtime.ndarray.from_dlpack(to_dlpack(tensor)), to convert PT tensors to TVM before, and that script still works today.

So I have no idea what’s going on here.

Opened a discuss post in the PT forum

Ok, it seems this is expected behavior: https://github.com/pytorch/pytorch/issues/99803#issuecomment-1519632449

We can enhance the check on the TVM side so it can still prove the array is contiguous: for each dimension, accept it if either the stride matches the product of the inner extents, or the extent is 1 (in which case the stride does not matter).
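
For illustration, a hypothetical Python helper sketching that relaxed check (the real check lives in TVM's generated argument-binding code, not in Python):

def is_compact(shape, strides):
    # Accept a dimension if its stride equals the compact (row-major) stride,
    # or if its extent is 1, in which case the stride is irrelevant.
    expected = 1
    for extent, stride in zip(reversed(shape), reversed(strides)):
        if extent != 1 and stride != expected:
            return False
        expected *= extent
    return True

print(is_compact((1, 77), (1, 1)))  # True: dim 0 has extent 1, its stride is ignored
print(is_compact((2, 77), (1, 1)))  # False: dim 0 really is non-compact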

Alternatively, we can create normalized strides in the TVM runtime's FromDLPack, which might be more convenient.
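
A sketch of that second option, again as hypothetical Python rather than the actual FromDLPack code in the TVM C++ runtime:

def normalize_strides(shape, strides):
    # Rewrite the strides reported by DLPack so that extent-1 dimensions
    # get the compact stride; any stride value is valid for such dimensions.
    normalized = list(strides)
    expected = 1
    for i in range(len(shape) - 1, -1, -1):
        if shape[i] == 1:
            normalized[i] = expected
        expected *= shape[i]
    return tuple(normalized)

print(normalize_strides((1, 77), (1, 1)))  # (77, 1)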

Ok I’ll work on that after our CI is updated to PT 2.0.

Fixed by https://github.com/apache/tvm/pull/14797