(venv) kaosnew:benchmark sam$ python3 gpu_imagenet_bench.py --model gfx900 --target metal
--------------------------------------------------
Network Name Mean Inference Time (std dev)
--------------------------------------------------
Cannot find config for target=metal -model=gfx900, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=metal -model=gfx900, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=metal -model=gfx900, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=metal -model=gfx900, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Traceback (most recent call last):
File "gpu_imagenet_bench.py", line 84, in <module>
benchmark(network, target)
File "gpu_imagenet_bench.py", line 37, in benchmark
graph, lib, params = relay.build(net, target=target, params=params)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/relay/build_module.py", line 251, in build
graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/relay/build_module.py", line 120, in build
self._build(mod, target, target_host)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 213, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (8) 9 libtvm.dylib 0x00000001177d28b9 tvm::NodeFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*) const + 297
[bt] (7) 8 libtvm.dylib 0x00000001177d4268 tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>::InitVTable()::'lambda4'(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*)::__invoke(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*) + 24
[bt] (6) 7 libtvm.dylib 0x00000001177d09b2 tvm::relay::ScheduleGetter::VisitExpr_(tvm::relay::CallNode const*) + 722
[bt] (5) 6 libtvm.dylib 0x00000001177cf97c tvm::relay::ScheduleGetter::VisitExpr(tvm::RelayExpr const&) + 252
[bt] (4) 5 libtvm.dylib 0x00000001177d2602 tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>::VisitExpr(tvm::RelayExpr const&) + 226
[bt] (3) 4 libtvm.dylib 0x00000001177d28b9 tvm::NodeFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*) const + 297
[bt] (2) 3 libtvm.dylib 0x00000001177d4268 tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>::InitVTable()::'lambda4'(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*)::__invoke(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Array<tvm::te::Tensor, void> (tvm::RelayExpr const&)>*) + 24
[bt] (1) 2 libtvm.dylib 0x00000001177d0ee3 tvm::relay::ScheduleGetter::VisitExpr_(tvm::relay::CallNode const*) + 2051
[bt] (0) 1 libtvm.dylib 0x0000000117926e25 std::__1::__function::__func<TVMFuncCreateFromCFunc::$_2, std::__1::allocator<TVMFuncCreateFromCFunc::$_2>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 213
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 78, in cfun
rv = local_pyfunc(*pyargs)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/relay/backend/compile_engine.py", line 250, in lower_call
op, call.attrs, inputs, ret_type, target)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/relay/backend/compile_engine.py", line 183, in select_implementation
all_impls = get_valid_implementations(op, attrs, inputs, out_type, target)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/relay/backend/compile_engine.py", line 124, in get_valid_implementations
strategy = fstrategy(attrs, inputs, out_type, target)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/target/generic_func.py", line 45, in __call__
return _ffi_api.GenericFuncCallFunc(self, *args)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 213, in __call__
raise get_last_ffi_error()
[bt] (5) 6 ??? 0x00007ffee46c1820 0x0 + 140732730710048
[bt] (4) 5 _ctypes.cpython-37m-darwin.so 0x000000011650436f ffi_call_unix64 + 79
[bt] (3) 4 libtvm.dylib 0x0000000117925266 TVMFuncCall + 70
[bt] (2) 3 libtvm.dylib 0x00000001172ff0b5 std::__1::__function::__func<tvm::$_5, std::__1::allocator<tvm::$_5>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 181
[bt] (1) 2 libtvm.dylib 0x00000001172fcde7 tvm::GenericFunc::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 743
[bt] (0) 1 libtvm.dylib 0x0000000117926e25 std::__1::__function::__func<TVMFuncCreateFromCFunc::$_2, std::__1::allocator<TVMFuncCreateFromCFunc::$_2>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 213
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 78, in cfun
rv = local_pyfunc(*pyargs)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/relay/op/strategy/cuda.py", line 313, in dense_strategy_cuda
if nvcc.have_tensorcore(tvm.gpu(0).compute_version):
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/runtime_ctypes.py", line 218, in compute_version
self.device_type, self.device_id, 4)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/runtime_ctypes.py", line 180, in _GetDeviceAttr
device_type, device_id, attr_id)
File "/Users/sam/dev/github/tvm/build/numberwang/venv/lib/python3.7/site-packages/tvm-0.7.dev1-py3.7-macosx-10.13-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 213, in __call__
raise get_last_ffi_error()
[bt] (6) 7 ??? 0x00007ffee46c0130 0x0 + 140732730704176
[bt] (5) 6 _ctypes.cpython-37m-darwin.so 0x000000011650436f ffi_call_unix64 + 79
[bt] (4) 5 libtvm.dylib 0x0000000117925266 TVMFuncCall + 70
[bt] (3) 4 libtvm.dylib 0x0000000117927340 std::__1::__function::__func<$_4, std::__1::allocator<$_4>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 400
[bt] (2) 3 libtvm.dylib 0x00000001179264a4 tvm::runtime::DeviceAPIManager::GetAPI(int, bool) + 532
[bt] (1) 2 libtvm.dylib 0x0000000117926735 tvm::runtime::DeviceAPIManager::GetAPI(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool) + 421
[bt] (0) 1 libtvm.dylib 0x0000000116ee0909 dmlc::LogMessageFatal::~LogMessageFatal() + 57
File "/Users/sam/dev/github/tvm/src/runtime/c_runtime_api.cc", line 133
TVMError: Check failed: allow_missing: Device API gpu is not enabled.
Looks like this might have fixed it:
That helped with some linking warnings but the benchmark doesn’t work
I am thinking about fully reimagining my machine to rule out a flaky OS XCode
Sam