[VTA] Workaround for Autotuning with One PYNQ Z1 Board

@hht Thank you for your answer. I used your code, except that I commented out the `return` so that tuning actually runs. The tuning completed, but it then failed at `lib.save(temp.relpath("graphlib.o"))`:

    ...
     # We do not run the tuning in our webpage server since it takes too long.
     # Comment the following line to run it by yourself.
     return
 
     # run tuning tasks
     print("Tuning...")
     tune_tasks(tasks, **tuning_opt)
 
     # evaluate with tuning history
     if env.TARGET != "sim":
         # Get remote from fleet node
         remote = autotvm.measure.request_remote(
             env.TARGET, tracker_host, tracker_port, timeout=10000
         )
         # Reconfigure the JIT runtime and FPGA.
         vta.reconfig_runtime(remote)
         vta.program_fpga(remote, bitstream=None)
     else:
         # In simulation mode, host the RPC server locally.
         remote = rpc.LocalSession()
 
     # compile kernels with history best records
     with autotvm.tophub.context(target, extra_files=[log_file]):
         # Compile network
         print("Compile...")
         if target.device_name != "vta":
             with tvm.transform.PassContext(opt_level=3, disabled_pass={"AlterOpLayout"}):
                 lib = relay.build(
                     relay_prog, target=target, params=params, target_host=env.target_host
                 )
         else:
             with vta.build_config(opt_level=3, disabled_pass={"AlterOpLayout"}):
                 lib = relay.build(
                     relay_prog, target=target, params=params, target_host=env.target_host
                 )
 
         # Export library
         print("Upload...")
         temp = utils.tempdir()
         lib.save(temp.relpath("graphlib.o"))  # <<<<< it failed here
         remote.upload(temp.relpath("graphlib.o"))
         lib = remote.load_module("graphlib.o")
 
         # Generate the graph runtime
         ctx = remote.ext_dev(0) if device == "vta" else remote.cpu(0)
         m = graph_runtime.GraphModule(lib["default"](ctx))
 
         # upload parameters to device
         image = tvm.nd.array((np.random.uniform(size=(1, 3, 224, 224))).astype("float32"))
         m.set_input("data", image)
 
         # evaluate
         print("Evaluate inference time cost...")
         timer = m.module.time_evaluator("run", ctx, number=1, repeat=10)
         tcost = timer()
         prof_res = np.array(tcost.results) * 1000  # convert to millisecond
         print(
             "Mean inference time (std dev): %.2f ms (%.2f ms)"
             % (np.mean(prof_res), np.std(prof_res))
         )
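Could the problem be that `relay.build` now returns a graph-executor factory module (which would be why `lib["default"](ctx)` works further down), so it no longer has the old `Module.save` method? The current `tune_relay_vta.py` tutorial exports with `export_library` and a `.tar` archive instead of saving a `.o`. My guess at the fix, untested on my board:

```python
# Guess at a fix: export the factory module as a .tar archive with
# export_library() instead of Module.save(), matching the current tutorial.
temp = utils.tempdir()
lib.export_library(temp.relpath("graphlib.tar"))
remote.upload(temp.relpath("graphlib.tar"))
lib = remote.load_module("graphlib.tar")
```

If that is the right direction, the rest of the script (graph runtime creation and evaluation) should work unchanged.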

Any thoughts?

Thank you very much,