Hi, I have a use case where I need to pass Any to relay ops. Concretely, I want to make the max_loop_count below in the Torch frontend Any, so that we can support truly dynamic for loops.
Right now max_loop_count is always const. But since in Python this is usually the size of some tensor axis or the length of a list, both of which are dynamic, requiring the loop extent to be a constant is a severe limitation.
If I pass Any to relay.op.less above, I get
TVMError: Check failed: ObjectTypeChecker&lt;TObjectRef&gt;::Check(ptr): Expect relay.Expr but get Any
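To make the failure concrete, here is a minimal repro sketch of the pattern described above (the variable names are made up; relay.Any(), relay.var, and relay.less are the standard Relay APIs):

```python
from tvm import relay

i = relay.var("i", shape=(), dtype="int32")
max_loop_count = relay.Any()          # a type-level dimension placeholder, not a relay.Expr
cond = relay.less(i, max_loop_count)  # fails: less expects relay.Expr arguments
```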
Any is a dimension, it is not a value. I am confused - are you trying to pass a value that is as big as possible? That could probably be done by setting a very large number.
No. In my use case, Any results from the tensor array stack op, which takes a list of tensors and stacks them into a tensor one rank higher. Since the length of the list is dynamic, the “stacked dimension” is Any. And then I want to loop over the stacked dimension.
Specifically, I want to support the following line in the Stacked LSTM example in the Torch repo.
The output of rnn_layer above comes from torch.stack, which in my translation to Relay gives a tensor with an Any dimension. In the second iteration, the input tensor to rnn_layer has Any as the size of the first axis, and I need to loop over this axis:
So even though Any in Relay is technically a dimension, I think it makes sense to treat the dimension as a value.
Yes, I’ve just discovered the shape_of function while looking at test_any.py, and it worked!! Here is my new converter for aten::size(). If we discover Any as a result of infer_shape, I use the shape_of function.
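For readers who land here, below is a minimal sketch of what such a converter can look like. It is not the exact code from the frontend: the (inputs, input_types) calling convention and the infer_shape helper are borrowed from tvm.relay.frontend, and the check for a dynamic dimension is written defensively (anything that is not a plain Python int is treated as Any), since the exact representation returned by shape inference can vary between TVM versions.

```python
from tvm import relay
from tvm.relay import op as _op
from tvm.relay.frontend.common import infer_shape


def _size():
    def _impl(inputs, input_types):
        # Static shape as seen by Relay's type inference; dynamic axes show up
        # as Any rather than a plain Python int.
        shape = infer_shape(inputs[0])

        if len(inputs) > 1:
            axis = int(inputs[1])
            dim = shape[axis]
            if isinstance(dim, int):
                return dim  # static dimension: keep it as a Python constant
            # Dynamic dimension: query the shape at runtime and pick the axis.
            dyn_shape = _op.shape_of(inputs[0], dtype="int32")
            return _op.take(dyn_shape, relay.const(axis), axis=0)

        if all(isinstance(dim, int) for dim in shape):
            return shape
        # At least one dynamic axis: return the full runtime shape tensor.
        return _op.shape_of(inputs[0], dtype="int32")

    return _impl
```

The scalar returned for a dynamic axis can then drive Relay's while_loop helper, so the trip count is decided at runtime instead of being fixed at conversion time. Again a sketch under the same assumptions; the body here just increments the counter:

```python
from tvm.relay.loops import while_loop

inp = relay.var("inp", shape=(relay.Any(), 4), dtype="float32")
extent = _op.take(_op.shape_of(inp, dtype="int32"), relay.const(0), axis=0)

i = relay.var("i", shape=(), dtype="int32")
loop = while_loop(
    lambda c: _op.less(c, extent),            # condition uses the runtime extent
    [i],
    lambda c: [c + relay.const(1, "int32")],  # real per-step work omitted
)
result = loop(relay.const(0, "int32"))        # a Tuple of the final loop variables
```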
You probably know this, but IMO the more principled approach for LSTM is to use ADTs instead of dynamically shaped tensors.
Doing so unlocks optimization possibilities such as loop fusion, dynamic batching, and partial evaluation.
Yeah, right now my focus is on translating readily available PyTorch LSTM models. Python only has lists and tuples, so I think I should stick with them for now. I’m also curious about the “Tree LSTM” variant, although I don’t know what that is or how PyTorch people implement it.