Hi, I'm auto-tuning an Inception-v3 model to compare performance on an NVIDIA GPU against TensorFlow Go. The TensorFlow version I'm using is 1.4.1, in an Ubuntu 16.04 Docker image (specifically docker pull tensorflow/tensorflow:1.4.1-devel-gpu-py3), with CUDA 8.0, cuDNN 6, GCC 5.4.0 and LLVM 6.
I downloaded the latest TVM and followed the installation guide. The scripts from the tutorials run fine, apart from some warning messages like "WARNING:autotvm:Cannot find config for target=cuda …". So I tried auto-tuning with target = tvm.target.create('cuda -libs=cudnn -model=p40'), and got this:
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:243: CUDNN Found 8 fwd algorithms, choosing CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 0) CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM - time: 0.270336 ms, Memory: 0
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 1) CUDNN_CONVOLUTION_FWD_ALGO_GEMM - time: 0.272384 ms, Memory: 524288
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 2) CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM - time: 0.344064 ms, Memory: 392
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 3) CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING - time: 0.847872 ms, Memory: 55914912
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 4) CUDNN_CONVOLUTION_FWD_ALGO_FFT - time: 22.5925 ms, Memory: 912261120
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 5) CUDNN_CONVOLUTION_FWD_ALGO_DIRECT - time: -1 ms, Memory: 0
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 6) CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD - time: -1 ms, Memory: 0
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 7) CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED - time: -1 ms, Memory: 0
[Task 1/43] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (1/10) | 0.95 s Done.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File “/usr/lib/python3.5/multiprocessing/pool.py”, line 119, in worker
result = (True, func(*args, **kwds))
File “/usr/lib/python3.5/multiprocessing/pool.py”, line 44, in mapstar
return list(map(*args))
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/tuner/xgboost_cost_model.py”, line 326, in _extract_itervar_feature_log
sch, args = inp.task.instantiate(config)
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/task/task.py”, line 65, in instantiate
sch, arg_bufs = self.func(*self.args, **self.kwargs)
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/task/topi_integration.py”, line 133, in _topi_nn_conv2d
C = topi.nn.conv2d(*args, **kwargs)
File “”, line 2, in conv2d
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/target.py”, line 356, in dispatch_func
return dispatch_dict[k](*args, **kwargs)
File “”, line 2, in config_dispatcher
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/task/dispatcher.py”, line 204, in dispatch_func
return dispatch_dict[cfg.template_key](cfg, *args, **kwargs)
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/task/topi_integration.py”, line 267, in template_call
node = f(cfg, *args, **kwargs)
File “/usr/local/lib/python3.5/dist-packages/topi-0.5.dev0-py3.5.egg/topi/cuda/conv2d.py”, line 86, in conv2d_cuda
algo=-1) # let CUDNN choose the best algo
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/contrib/cudnn.py”, line 353, in conv2d_forward
oshape)
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/contrib/cudnn.py”, line 284, in conv2d_find_algo
int(y_shape[3]))
File “tvm/_ffi/_cython/./function.pxi”, line 286, in tvm._ffi._cy3.core.FunctionBase.call
File “tvm/_ffi/_cython/./function.pxi”, line 231, in tvm._ffi._cy3.core.FuncCall
File “tvm/_ffi/_cython/./base.pxi”, line 151, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: [09:47:44] /usr/tvm/src/contrib/cudnn/conv_forward.cc:229: Check failed: e == CUDNN_STATUS_SUCCESS (2 vs. 0) cuDNN: CUDNN_STATUS_ALLOC_FAILED
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x73261d) [0x7f7029a6d61d]
[bt] (1) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0xec1280) [0x7f702a1fc280]
[bt] (2) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0xec1f74) [0x7f702a1fcf74]
[bt] (3) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x5e) [0x7f702a17da8e]
[bt] (4) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/_ffi/_cy3/core.cpython-35m-x86_64-linux-gnu.so(+0x1862d) [0x7f6fa60de62d]
[bt] (5) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/_ffi/_cy3/core.cpython-35m-x86_64-linux-gnu.so(+0x18d1b) [0x7f6fa60ded1b]
[bt] (6) python3(PyObject_Call+0x47) [0x5c1797]
[bt] (7) python3(PyEval_EvalFrameEx+0x4ec6) [0x53bba6]
[bt] (8) python3(PyEval_EvalFrameEx+0x4b04) [0x53b7e4]
[bt] (9) python3() [0x5406df]
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File “tune_relay_cuda.py”, line 242, in
tune_and_evaluate(tuning_option)
File “tune_relay_cuda.py”, line 211, in tune_and_evaluate
tune_tasks(tasks, **tuning_opt)
File “tune_relay_cuda.py”, line 184, in tune_tasks
tuner_obj.load_history(autotvm.record.load_from_file(tmp_log_file))
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/tuner/model_based_tuner.py”, line 272, in load_history
success = base_model.fit_log(data_set, self.plan_size)
File “/usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/tuner/xgboost_cost_model.py”, line 223, in fit_log
res = pool.map(feature_extract_func, data)
File “/usr/lib/python3.5/multiprocessing/pool.py”, line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File “/usr/lib/python3.5/multiprocessing/pool.py”, line 608, in get
raise self._value
tvm._ffi.base.TVMError: [09:47:44] /usr/tvm/src/contrib/cudnn/conv_forward.cc:229: Check failed: e == CUDNN_STATUS_SUCCESS (2 vs. 0) cuDNN: CUDNN_STATUS_ALLOC_FAILED
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x73261d) [0x7f7029a6d61d]
[bt] (1) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0xec1280) [0x7f702a1fc280]
[bt] (2) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0xec1f74) [0x7f702a1fcf74]
[bt] (3) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x5e) [0x7f702a17da8e]
[bt] (4) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/_ffi/_cy3/core.cpython-35m-x86_64-linux-gnu.so(+0x1862d) [0x7f6fa60de62d]
[bt] (5) /usr/local/lib/python3.5/dist-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/_ffi/_cy3/core.cpython-35m-x86_64-linux-gnu.so(+0x18d1b) [0x7f6fa60ded1b]
[bt] (6) python3(PyObject_Call+0x47) [0x5c1797]
[bt] (7) python3(PyEval_EvalFrameEx+0x4ec6) [0x53bba6]
[bt] (8) python3(PyEval_EvalFrameEx+0x4b04) [0x53b7e4]
[bt] (9) python3() [0x5406df]
Any advice?
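
In case it helps with reading the log, here is a rough sketch (not part of tune_relay_cuda.py or TVM) that parses the cuDNN lines from the output above and picks the fastest algorithm that reported a valid time. The helper names parse_cudnn_algos and fastest_usable are made up for illustration, and I'm assuming that a time of -1 means cuDNN could not run that algorithm for this convolution.

```python
import re

# Matches per-algorithm lines such as:
# "... conv_forward.cc:246: 3) CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING - time: 0.847872 ms, Memory: 55914912"
ALGO_LINE = re.compile(
    r"(\d+)\)\s+(CUDNN_CONVOLUTION_FWD_ALGO_\w+)\s+-\s+"
    r"time:\s+(-?[\d.]+)\s+ms,\s+Memory:\s+(\d+)"
)


def parse_cudnn_algos(log_text):
    """Return one dict per algorithm line found in the log text."""
    algos = []
    for match in ALGO_LINE.finditer(log_text):
        rank, name, time_ms, memory = match.groups()
        algos.append(
            {
                "rank": int(rank),
                "name": name,
                "time_ms": float(time_ms),
                "memory": int(memory),  # workspace size as printed in the log
            }
        )
    return algos


def fastest_usable(algos):
    """Pick the lowest-time entry, skipping entries with a negative time.

    Assumption: a negative time marks an algorithm that did not run.
    """
    usable = [a for a in algos if a["time_ms"] >= 0]
    return min(usable, key=lambda a: a["time_ms"]) if usable else None


# Three lines copied from the log above.
sample = """\
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 0) CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM - time: 0.270336 ms, Memory: 0
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 4) CUDNN_CONVOLUTION_FWD_ALGO_FFT - time: 22.5925 ms, Memory: 912261120
[09:47:40] /usr/tvm/src/contrib/cudnn/conv_forward.cc:246: 5) CUDNN_CONVOLUTION_FWD_ALGO_DIRECT - time: -1 ms, Memory: 0
"""

algos = parse_cudnn_algos(sample)
best = fastest_usable(algos)
print(best["name"], best["time_ms"], best["memory"])
```

On the sample lines this selects the same algorithm that the "choosing ..." line in the log names, so the first cuDNN search in the main process appears to have completed normally before the ALLOC_FAILED error was raised in the pool worker.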