Issues with Autotuning a Convolutional Network for VTA Simulator

AnandGokhale · February 27, 2020, 1:39pm

I was trying to tune a network for the simulator that VTA ships as part of its GitHub repo. I followed the instructions in this doc (as of Feb 27th, 2020). The only change I made was to use the simulator instead of a Pynq board. I started the RPC server with the following command:

python -m tvm.exec.rpc_server --tracker=[HOST_IP]:9190 --key=sim

My error log is:

Extract tasks...
Extracted 10 conv2d tasks:
(1, 14, 14, 256, 512, 1, 1, 0, 0, 2, 2)
(1, 28, 28, 128, 256, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 128, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 64, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 128, 3, 3, 1, 1, 1, 1)
(1, 56, 56, 64, 128, 3, 3, 1, 1, 2, 2)
(1, 14, 14, 256, 256, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 256, 3, 3, 1, 1, 2, 2)
(1, 7, 7, 512, 512, 3, 3, 1, 1, 1, 1)
(1, 14, 14, 256, 512, 3, 3, 1, 1, 2, 2)
Tuning...
[Task  1/10]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/480) | 0.00 s
Traceback (most recent call last):

  File "conv_autotune.py", line 435, in <module>
    tune_and_evaluate(tuning_option)

  File "conv_autotune.py", line 388, in tune_and_evaluate
    tune_tasks(tasks, **tuning_opt)

  File "conv_autotune.py", line 286, in tune_tasks
    autotvm.callback.log_to_file(tmp_log_file)

  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/autotvm/tuner/tuner.py", line 108, in tune
    measure_batch = create_measure_batch(self.task, measure_option)

  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/autotvm/measure/measure.py", line 253, in create_measure_batch
    attach_objects = runner.set_task(task)

  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/autotvm/measure/measure_methods.py", line 215, in set_task
    raise RuntimeError("Cannot get remote devices from the tracker. "

RuntimeError: Cannot get remote devices from the tracker. Please check the status of tracker by 'python -m tvm.exec.query_rpc_tracker --port [THE PORT YOU USE]' and make sure you have free devices on the queue status.

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/anand/anaconda3/envs/mxnet/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/anand/anaconda3/envs/mxnet/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/autotvm/measure/measure_methods.py", line 580, in _check
    while not ctx.exist:  # wait until we get an available device
  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/_ffi/runtime_ctypes.py", line 186, in exist
    self.device_type, self.device_id, 0) != 0
  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/_ffi/runtime_ctypes.py", line 180, in _GetDeviceAttr
    device_type, device_id, attr_id)
  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 213, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (7) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(TVMFuncCall+0x65) [0x7f4b70847065]
  [bt] (6) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(+0x134e121) [0x7f4b70845121]
  [bt] (5) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(tvm::runtime::RPCDeviceAPI::GetAttr(DLContext, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue*)+0x144) [0x7f4b7089f4e4]
  [bt] (4) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(+0x13ba530) [0x7f4b708b1530]
  [bt] (3) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(tvm::runtime::RPCSession::HandleUntilReturnEvent(tvm::runtime::TVMRetValue*, bool, tvm::runtime::PackedFunc const*)+0x105) [0x7f4b708b1385]
  [bt] (2) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(+0x13c800c) [0x7f4b708bf00c]
  [bt] (1) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(tvm::support::Socket::Error(char const*)+0x9e) [0x7f4b708b398e]
  [bt] (0) /home/anand/workspace/Compiler_Project/TVM/incubator-tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x43) [0x7f4b7001d1a3]
  File "/home/anand/workspace/Compiler_Project/TVM/incubator-tvm/src/runtime/rpc/../../support/socket.h", line 362
TVMError: Socket SockChannel::Recv Error:Connection reset by peer

My RPC tracker looks like this:

Tracker address 0.0.0.0:9190

    Server List
    ----------------------------
    server-address	key
    ----------------------------
    127.0.0.1:41052	server:sim
    127.0.0.1:41054	server:sim
    ----------------------------

    Queue Status
    ---------------------------
    key   total  free  pending
    ---------------------------
    sim   2      1     0      
    ---------------------------

This is the tracker status while the program is running. How do I go about autotuning the network for the simulator? Has anyone else faced similar issues? Please note that the same issue occurs even if I use only one instance of sim.
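
For anyone wanting to script the "free devices" check that the error message asks for, here is a minimal sketch that parses the Queue Status table from a tracker summary. It assumes the summary text has the layout pasted above, and the helper names (parse_queue_status, has_free_device) are hypothetical, not part of TVM. In my case the table does report one free sim device, so this check alone does not explain the error.

```python
import re

# Queue Status section copied from the tracker output above.
SUMMARY = """\
Queue Status
---------------------------
key   total  free  pending
---------------------------
sim   2      1     0
---------------------------
"""


def parse_queue_status(summary):
    """Return {key: {"total": int, "free": int, "pending": int}}.

    Hypothetical helper: assumes each data row is 'key total free pending'
    and that the table follows a 'Queue Status' heading.
    """
    rows = {}
    in_queue = False
    for line in summary.splitlines():
        stripped = line.strip()
        if stripped.startswith("Queue Status"):
            in_queue = True
            continue
        if not in_queue:
            continue  # skip the Server List section, if present
        # Header and separator rows do not match: they lack numeric columns.
        match = re.match(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)$", stripped)
        if match:
            key, total, free, pending = match.groups()
            rows[key] = {
                "total": int(total),
                "free": int(free),
                "pending": int(pending),
            }
    return rows


def has_free_device(summary, key):
    """True if the summary lists at least one free device for `key`."""
    entry = parse_queue_status(summary).get(key)
    return entry is not None and entry["free"] > 0


print(parse_queue_status(SUMMARY))
print(has_free_device(SUMMARY, "sim"))
```

The summary string could come from the output of the query_rpc_tracker command mentioned in the error message; how you capture it is up to you.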

surya00060 · March 2, 2020, 10:02am

@thierry @tqchen Could you show a workaround here?

suvadeep · April 21, 2020, 9:00pm

Hi @thierry - Can we get an AutoTVM run working with the sim target on our local machine, instead of on an external target such as a Pynq board?

suvadeep · May 13, 2020, 6:39pm

Hi @thierry - Any updates or advice on this?

keai007 · November 1, 2020, 12:22pm

Same issue. Can anyone help?

yuxguo · November 2, 2020, 7:25am

Same issue here. I'm trying to tune some conv2d_packed workloads on the VTA fast simulator and test the speed of these ops. What should I do? How should I set the target of the AutoTVM task and the runner of measure_option?