I have been working on support for calling TorchScript from TVM as a backend. This can be used as a fallback for when PyTorch operators are not yet implemented in TVM, or when one wants to incorporate bespoke PyTorch custom ops into TVM with ease.
My proposed implementation strategy is:

- add a new relay op `torchop` that takes a variable number of inputs and executes a provided TorchScript (aka PyTorch JIT) function,
- add a backend module class that calls into LibTorch (aka the C++ bindings underlying PyTorch), executing a TorchScript function (a minimal sketch follows below).
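To make the backend-module bullet concrete, here is a minimal sketch of how executing the TorchScript could look at the C++ level; `RunTorchScript` and all the surrounding plumbing are placeholders of mine, not code from the PR:

```cpp
#include <torch/script.h>  // LibTorch's TorchScript headers

#include <sstream>
#include <string>
#include <vector>

// Deserialize a TorchScript blob and execute it on the given inputs.
// A real backend module would deserialize once and cache the module
// rather than reloading it on every call.
torch::IValue RunTorchScript(const std::string& serialized,
                             const std::vector<torch::Tensor>& inputs) {
  std::istringstream stream(serialized);              // binary blob (see below)
  torch::jit::Module mod = torch::jit::load(stream);  // LibTorch deserializer
  std::vector<torch::IValue> args(inputs.begin(), inputs.end());
  return mod.forward(args);                           // run under LibTorch
}
```

At the TVM/LibTorch boundary, the module would additionally need to convert between TVM's NDArray and `torch::Tensor`, which both sides support via DLPack.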
The key addition to the relay infrastructure concerns variable-arity operators: while leaving `num_inputs == -1` on operator registration is documented to indicate a variable number of inputs, the type inference pass is not prepared to deal with it and instead requires the number of arguments provided to match the number of arguments declared with `add_argument` on operator registration. The proposed change to the type inference is then to match the declared arguments as before, but to allow additional arguments if `num_inputs` is `-1`.
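For illustration, the registration might then look roughly like the following; the op name, the declared argument, and the `TorchOpRel` type relation are placeholders under my reading of the proposal, not necessarily what the PR does:

```cpp
// Sketch of registering a variable-arity relay op in C++. With the
// proposed type inference change, calls may supply more arguments than
// declared via add_argument, because num_inputs is -1.
RELAY_REGISTER_OP("torchop")
    .describe("Executes a serialized TorchScript function via LibTorch.")
    .set_attrs_type<TorchOpAttrs>()  // attributes sketch further below
    .set_num_inputs(-1)              // documented: variable number of inputs
    .add_argument("data", "Tensor", "The first input tensor.")
    .set_support_level(99)
    .add_type_rel("TorchOp", TorchOpRel);  // placeholder type relation
```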
The other detail is that the serialized TorchScript is carried as a string attribute in the call node's attributes. This is a bit fishy, as the serialized representation is binary. I used serialization to get the TorchScript into TVM at the C++ level because it is tricky to interoperate between PyBind-wrapped objects (TorchScript in PyTorch) and the TVM FFI, but we might pass things around as handles later (though I'm not sure whether that works well with attributes). I would be glad to have your advice on this.
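Concretely, I mean an attributes node roughly like this; struct and field names are placeholders of mine:

```cpp
#include <tvm/ir/attrs.h>

#include <string>

// Sketch of call-node attributes carrying the serialized TorchScript
// as a (binary) string, i.e. the part discussed above as fishy.
struct TorchOpAttrs : public tvm::AttrsNode<TorchOpAttrs> {
  std::string serialized_function;

  TVM_DECLARE_ATTRS(TorchOpAttrs, "relay.attrs.TorchOpAttrs") {
    TVM_ATTR_FIELD(serialized_function)
        .set_default("")
        .describe("Serialized TorchScript function (binary blob).");
  }
};
```

The handle alternative would keep the TorchScript object alive on the PyTorch side and store only an opaque pointer here, which avoids the binary-in-a-string issue but raises the attribute-compatibility question above.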
Currently I only support a single FP32 output tensor, but making this flexible is straightforward, and I would do so in parallel to the more fundamental discussion, e.g. around the points above.
Even though it is still in a draft state, I have opened PR 7401 to make the discussion concrete in terms of code.
Thank you!