[AutoTVM] Cannot get remote devices from tracker

Hello, I’m trying to autotune my CNN to run on PYNQ FPGA using this tutorial.

I get the following error when I run the code:

  File "autotune.py", line 256, in <module>
    tune_and_evaluate(tuning_option)
  File "autotune.py", line 214, in tune_and_evaluate
    tune_tasks(tasks, **tuning_opt)
  File "autotune.py", line 138, in tune_tasks
    autotvm.callback.log_to_file(tmp_log_file)])
  File "/home/youn/.local/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/tuner/tuner.py", line 108, in tune
    measure_batch = create_measure_batch(self.task, measure_option)
  File "/home/youn/.local/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/measure/measure.py", line 252, in create_measure_batch
    attach_objects = runner.set_task(task)
  File "/home/youn/.local/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/measure/measure_methods.py", line 211, in set_task
    raise RuntimeError("Cannot get remote devices from the tracker. "
RuntimeError: Cannot get remote devices from the tracker. Please check the status of tracker by 'python -m tvm.exec.query_rpc_tracker --port [THE PORT YOU USE]' and make sure you have free devices on the queue status.
^CException ignored in: <module 'threading' from '/usr/lib/python3.6/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 1294, in _shutdown
    t.join()
  File "/usr/lib/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):

I’m pretty sure my FPGA is registered, because when I run

python3 -m tvm.exec.query_rpc_tracker --port 9190

I get

Tracker address localhost:9190
Server List
----------------------------
server-address  key
----------------------------
192.168.1.15:51734      server:pynq
----------------------------

Queue Status
----------------------------
key    total  free  pending
----------------------------
pynq   1      1     0
----------------------------
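
For anyone who wants to script this check, the queue status table above can be parsed directly. This is only a minimal sketch, assuming the text layout matches the output shown in this post; `parse_queue_status` and `has_free_device` are hypothetical helper names, not part of TVM.

```python
def parse_queue_status(summary: str) -> dict:
    """Parse the 'Queue Status' table printed by tvm.exec.query_rpc_tracker.

    Returns {key: {"total": int, "free": int, "pending": int}}.
    Assumes the column layout shown in this thread: key, total, free, pending.
    """
    status = {}
    in_queue = False
    for line in summary.splitlines():
        stripped = line.strip()
        if stripped.startswith("Queue Status"):
            in_queue = True
            continue
        # Ignore everything before the queue table, blank lines, and rulers.
        if not in_queue or not stripped or set(stripped) == {"-"}:
            continue
        parts = stripped.split()
        # Skip the header row and any row that is not "key int int int".
        if len(parts) != 4 or not all(p.isdigit() for p in parts[1:]):
            continue
        key, total, free, pending = parts
        status[key] = {
            "total": int(total),
            "free": int(free),
            "pending": int(pending),
        }
    return status


def has_free_device(status: dict, key: str) -> bool:
    """True if the tracker reports at least one free device for `key`."""
    return status.get(key, {}).get("free", 0) > 0


# Sample text copied from the tracker output in this post.
SAMPLE = """\
Tracker address localhost:9190
Server List
----------------------------
server-address  key
----------------------------
192.168.1.15:51734      server:pynq
----------------------------

Queue Status
----------------------------
key    total  free  pending
----------------------------
pynq   1      1     0
----------------------------
"""

status = parse_queue_status(SAMPLE)
```

With the sample above, `has_free_device(status, "pynq")` reports that the board is free, which matches what the tracker shows before the tuning script starts.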

Just as a note, on the FPGA side I run

python3 -m vta.exec.rpc_server --tracker=192.168.1.4:9190 --key=pynq

while the tutorial says I should run

python3 -m tvm.exec.rpc_server --tracker=192.168.1.4:9190 --key=pynq

The latter gives me this error. You should update the tutorial if it is outdated. Otherwise, any help on figuring out what’s wrong would be appreciated. Thanks!


Hello youn123, I’m running into the same issue and noticed that no one ever responded to your question. Did you ever figure out the problem? If so, would you mind explaining the solution?


Hi VTA users, @thierry

I am running into a similar issue when running the example. My setup is as follows, and it runs fine with other VTA code.

  • ultra96
  • pynq version 2.5
  • tvm 5697440cc178e3d901302bc879c1eb506eff8064
  • vta 87ce9acfae550d1a487746e9d06c2e250076e54c
  • host: Ubuntu 18.04 (as an aside, I also tried on a Mac; it failed with a different error before getting this far, seemingly related to the OS, so I switched to Ubuntu, which got further: close, but no cigar).

When I start the tracker, I see the device shown as free; it is then taken by the AutoTVM script, which then reports the error. In other words, the initial RPC tracker connection was working.

I also changed the check_remote timeout based on the thread "[solved][AutoTVM] Cannot get remote devices from the tracker", but it didn’t make a difference.

$ python3 ./tune_relay_vta.py 
Extract tasks...
Extracted 10 conv2d tasks:
(1, 14, 14, 256, 512, 1, 1, 0, 0, 2, 2)
(1, 28, 28, 128, 256, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 128, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 64, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 128, 3, 3, 1, 1, 1, 1)
(1, 56, 56, 64, 128, 3, 3, 1, 1, 2, 2)
(1, 14, 14, 256, 256, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 256, 3, 3, 1, 1, 2, 2)
(1, 7, 7, 512, 512, 3, 3, 1, 1, 1, 1)
(1, 14, 14, 256, 512, 3, 3, 1, 1, 2, 2)
Tuning...
Traceback (most recent call last):
  File "./tune_relay_vta.py", line 476, in <module>
    tune_and_evaluate(tuning_option)
  File "./tune_relay_vta.py", line 432, in tune_and_evaluate
    tune_tasks(tasks, **tuning_opt)
  File "./tune_relay_vta.py", line 304, in tune_tasks
    autotvm.callback.log_to_file(tmp_log_file),
  File "//Github/tvm_linux/python/tvm/autotvm/tuner/tuner.py", line 112, in tune
    measure_batch = create_measure_batch(self.task, measure_option)
  File "/Github/tvm_linux/python/tvm/autotvm/measure/measure.py", line 257, in create_measure_batch
    attach_objects = runner.set_task(task)
  File "/Github/tvm_linux/python/tvm/autotvm/measure/measure_methods.py", line 252, in set_task
    "Cannot get remote devices from the tracker. "
RuntimeError: Cannot get remote devices from the tracker. Please check the status of tracker by 'python -m tvm.exec.query_rpc_tracker --port [THE PORT YOU USE]' and make sure you have free devices on the queue status.


^CException ignored in: <module 'threading' from '/usr/lib/python3.6/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 1294, in _shutdown
    t.join()
  File "/usr/lib/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

This seems to be a similar discussion.

I would appreciate any tips or help.

Thank you very much,

All,

After digging further into the discussion, I found that a post by @hht (thank you very much!) has a fix for this. The issue was that the remote session requested at the start of tune_and_evaluate was already occupying the connection to the only board, so AutoTVM could not connect to the VTA device during tuning. This issue is described here.

Just to help out others, here is the diff against the file from @hht’s fork:

 $ diff -uN tune_relay_vta.py tune_relay_vta_one_board.py 
--- tune_relay_vta.py   2020-10-31 01:26:48.000000000 -0700
+++ tune_relay_vta_one_board.py 2020-11-23 01:19:07.000000000 -0800
@@ -70,6 +70,9 @@
 from vta.testing import simulator
 from vta.top import graph_pack
 
 #
 # You can register multiple devices to the tracker to accelerate tuning.
@@ -340,18 +342,6 @@
 
 def tune_and_evaluate(tuning_opt):
 
-    if env.TARGET != "sim":
-        # Get remote from fleet node
-        remote = autotvm.measure.request_remote(
-            env.TARGET, tracker_host, tracker_port, timeout=10000
-        )
-        # Reconfigure the JIT runtime and FPGA.
-        vta.reconfig_runtime(remote)
-        vta.program_fpga(remote, bitstream=None)
-    else:
-        # In simulation mode, host the RPC server locally.
-        remote = rpc.LocalSession()
-
     # Register VTA tuning tasks
     register_vta_tuning_tasks()
 
@@ -401,12 +391,25 @@
 
 
     # run tuning tasks
     print("Tuning...")
     tune_tasks(tasks, **tuning_opt)
 
+    # evaluate with tuning history
+    if env.TARGET != "sim":
+        # Get remote from fleet node
+        remote = autotvm.measure.request_remote(
+            env.TARGET, tracker_host, tracker_port, timeout=10000
+        )
+        # Reconfigure the JIT runtime and FPGA.
+        vta.reconfig_runtime(remote)
+        vta.program_fpga(remote, bitstream=None)
+    else:
+        # In simulation mode, host the RPC server locally.
+        remote = rpc.LocalSession()
+
     # compile kernels with history best records
     with autotvm.tophub.context(target, extra_files=[log_file]):
         # Compile network
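
To make the reasoning behind this reordering concrete, here is a toy model of the single-board situation. It is only a sketch under stated assumptions: `ToyTracker`, `tune`, `original_order`, and `reordered` are hypothetical stand-ins written for illustration, not TVM or VTA APIs, and the toy tracker fails immediately instead of waiting for a timeout as the real one does.

```python
class ToyTracker:
    """Hypothetical stand-in for an RPC tracker that hands out devices."""

    def __init__(self, devices):
        # Maps a device key (e.g. "pynq") to the number of free devices.
        self.free = dict(devices)

    def request(self, key):
        if self.free.get(key, 0) == 0:
            raise RuntimeError("Cannot get remote devices from the tracker.")
        self.free[key] -= 1
        return key

    def release(self, key):
        self.free[key] += 1


def tune(tracker, key):
    """Stand-in for tuning: borrows a device to take measurements."""
    session = tracker.request(key)
    tracker.release(session)
    return "tuned"


def original_order(tracker, key):
    """Request the remote first, then tune (the order before the diff)."""
    remote = tracker.request(key)  # the script now holds the only board
    try:
        return tune(tracker, key)  # tuning cannot obtain a device
    finally:
        tracker.release(remote)


def reordered(tracker, key):
    """Tune first, then request the remote (the order after the diff)."""
    result = tune(tracker, key)    # the board is still free during tuning
    remote = tracker.request(key)  # take the board only for evaluation
    tracker.release(remote)
    return result


tracker = ToyTracker({"pynq": 1})
try:
    original_order(tracker, "pynq")
    original_failed = False
except RuntimeError:
    original_failed = True

reordered_result = reordered(tracker, "pynq")
```

With one registered board, the first ordering fails inside the tuning step while the second completes. With two or more boards registered under the same key, both orderings would succeed in this toy model, which is consistent with the tutorial's note about registering multiple devices.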


@isong Hello, I am using the new code now, but I still hit the same problem.
See here: [AutoTVM] I connected the PYNQ, but cannot get remote devices from the tracker.
I don’t know what to do next.
Please give me some advice if you have time.