The current implementation for ARM operates on the 3rd dimension (pack_axis=2).
This leads to noticeably higher runtimes for inputs whose height and width are smaller than 32, i.e. N = M < 32.
This is likely due to the implementation mapping poorly onto the available ARM SIMD instructions.
Is it possible to change the dimension upon which the bitpacking is performed from a spatial dimension to the channel dimension?
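To make the question concrete, here is a minimal pure-Python sketch of the difference between packing along the width and packing along the channels. It is not TVM's bitpack: the helpers `pack_bits`, `pack_width`, and `pack_channels` are hypothetical, the tensor is a single image in HWC layout stored as nested lists, the word width of 8 bits is an assumption, and zero-padding a short tail is also an assumption that TVM may not share.

```python
def pack_bits(bits, word_width=8):
    """Pack a flat sequence of 0/1 values into unsigned ints, MSB first.

    A trailing partial word is zero-padded (an assumption of this sketch).
    """
    if word_width <= 0:
        raise ValueError("word_width must be positive")
    words = []
    for start in range(0, len(bits), word_width):
        chunk = list(bits[start:start + word_width])
        chunk += [0] * (word_width - len(chunk))  # pad the tail with zeros
        word = 0
        for bit in chunk:
            word = (word << 1) | (bit & 1)
        words.append(word)
    return words


def pack_width(tensor, word_width=8):
    """Pack along the width axis of an HWC tensor -> [h][w_word][c]."""
    packed = []
    for row in tensor:
        channels = len(row[0])
        # For every channel, gather the bits of all pixels in this row.
        per_channel = [
            pack_bits([pixel[c] for pixel in row], word_width)
            for c in range(channels)
        ]
        # Transpose from [c][w_word] to [w_word][c].
        packed.append([list(words) for words in zip(*per_channel)])
    return packed


def pack_channels(tensor, word_width=8):
    """Pack along the channel axis of an HWC tensor -> [h][w][c_word]."""
    return [[pack_bits(pixel, word_width) for pixel in row] for row in tensor]


# One row, 4 pixels wide, 16 channels, every bit set.
image = [[[1] * 16 for _ in range(4)]]

by_width = pack_width(image)        # width 4 < 8: half of every word is padding
by_channel = pack_channels(image)   # 16 channels fill two full words per pixel
```

In this toy case the width-packed words are all `0b11110000`, while the channel-packed words are completely filled. That is the kind of effect I suspect for small spatial sizes, although I have not confirmed that this is what happens in the ARM implementation.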
I have already tried replacing the value of the pack_axis argument with the index of the channel dimension, but I receive the following error, which I have not been able to trace:
tvm._ffi.base.TVMError: Traceback (most recent call last):
4: TVMFuncCall
3: _ZNSt17_Function_handlerIFvN3tvm7runtime7TVMArgsEPNS1_11TVMR
2: tvm::runtime::RPCWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
1: tvm::runtime::RPCClientSession::CallFunc(void*, TVMValue const*, int const*, int, std::function<void (tvm::runtime::TVMArgs)> const&)
0: tvm::runtime::RPCEndpoint::CallFunc(void*, TVMValue const*, int const*, int, std::function<void (tvm::runtime::TVMArgs)>)
File "/home/bsparks/tvm/src/runtime/rpc/rpc_endpoint.cc", line 797
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (code == RPCCode::kReturn) is false: code=kShutdown