Context: how do I use dlpackrs with Apache TVM for zero-copy?
I’ve found that with the `tvm_graph_rt` crate, the output is a `Tensor`, which has an `as_slice` function that returns… `[u8]`.
So… is it possible to iterate over the output rows using `as_slice` without a full GPU-to-CPU copy?
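To make the question concrete, here is a minimal sketch of the kind of row iteration I have in mind. It uses only the standard library and assumes the output is a row-major `f32` matrix whose bytes are already readable from the host. `rows_f32` is a hypothetical helper, not part of `tvm_graph_rt`, and the byte buffer in `main` only stands in for whatever `as_slice` returns.

```rust
use std::convert::TryInto;

/// Hypothetical helper (not a tvm_graph_rt API): walk a row-major f32
/// matrix stored as raw bytes, yielding one decoded row at a time.
/// Assumes `cols > 0` and native-endian f32 elements.
fn rows_f32(bytes: &[u8], cols: usize) -> impl Iterator<Item = Vec<f32>> + '_ {
    let row_bytes = cols * std::mem::size_of::<f32>();
    // chunks_exact splits the buffer into whole rows and ignores any tail.
    bytes.chunks_exact(row_bytes).map(|row| {
        row.chunks_exact(4)
            .map(|b| f32::from_ne_bytes(b.try_into().unwrap()))
            .collect()
    })
}

fn main() {
    // Stand-in for the bytes of a 2x3 output tensor.
    let values = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0];
    let bytes: Vec<u8> = values.iter().flat_map(|v| v.to_ne_bytes()).collect();

    for (i, row) in rows_f32(&bytes, 3).enumerate() {
        println!("row {}: {:?}", i, row);
    }
}
```

Note that this decodes each row into a small `Vec<f32>`, so it is a per-row copy on the host side and says nothing about whether the underlying bytes had to be copied off the GPU first, which is the part I am unsure about.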
Also, there seems to be no option to specify a device (e.g. `cuda(0)`) with `tvm_graph_rt`… Is this module 100% CPU-only?