Context: how do I use dlpackrs with Apache TVM for zero-copy?
I’ve found that with the tvm_graph_rt crate, the output is a Tensor, which has an as_slice function that returns… [u8].
So… is it possible to iterate over the output rows using as_slice without a full GPU-to-CPU copy?
Also, there seems to be no option to specify a device (e.g. cuda(0)) with tvm_graph_rt…
Is this module 100% CPU-only?
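
To make the row-iteration part concrete, here is a rough sketch of what I mean, using only the Rust standard library. It assumes the byte slice is already readable host memory holding row-major f32 values; the function name rows_f32 and the cols parameter are my own placeholders, not part of tvm_graph_rt, and the sketch says nothing about whether a device-to-host copy happened before the bytes became available.

```rust
/// Decode a raw byte buffer as rows of f32 values, one row at a time.
/// Assumptions (not confirmed for tvm_graph_rt): the buffer is host memory,
/// the dtype is f32 in native byte order, and the layout is row-major.
/// `rows_f32` and `cols` are hypothetical names used for illustration only.
fn rows_f32(bytes: &[u8], cols: usize) -> impl Iterator<Item = Vec<f32>> + '_ {
    assert!(cols > 0, "cols must be non-zero");
    let row_bytes = cols * std::mem::size_of::<f32>();
    // chunks_exact borrows from `bytes`; only one row is decoded per step.
    // Any trailing bytes that do not fill a whole row are ignored.
    bytes.chunks_exact(row_bytes).map(|row| {
        row.chunks_exact(4)
            // from_ne_bytes avoids alignment problems that a pointer cast
            // from &[u8] to &[f32] could run into.
            .map(|b| f32::from_ne_bytes([b[0], b[1], b[2], b[3]]))
            .collect()
    })
}
```

Each row is decoded lazily into a small Vec, so the whole buffer is never duplicated on the host side; whether the underlying bytes were already copied off the GPU is exactly the part I am unsure about.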