Figured it out.
Instead of save, which most likely stopped working because of API changes over time, export_library works.
Here is the change; with it the script runs to completion (output below).
# Export library
print("Upload...")
# Old code (save), replaced by the export_library version below:
#temp = utils.tempdir()
#lib.save(temp.relpath("graphlib.o"))
#remote.upload(temp.relpath("graphlib.o"))
#lib = remote.load_module("graphlib.o")
# New code: send the inference library over to the remote RPC server
temp = utils.tempdir()
lib.export_library(temp.relpath("graphlib.tar"))
remote.upload(temp.relpath("graphlib.tar"))
lib = remote.load_module("graphlib.tar")
Extract tasks...
Extracted 10 conv2d tasks:
(1, 14, 14, 256, 512, 1, 1, 0, 0, 2, 2)
(1, 28, 28, 128, 256, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 128, 1, 1, 0, 0, 2, 2)
(1, 56, 56, 64, 64, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 128, 3, 3, 1, 1, 1, 1)
(1, 56, 56, 64, 128, 3, 3, 1, 1, 2, 2)
(1, 14, 14, 256, 256, 3, 3, 1, 1, 1, 1)
(1, 28, 28, 128, 256, 3, 3, 1, 1, 2, 2)
(1, 7, 7, 512, 512, 3, 3, 1, 1, 1, 1)
(1, 14, 14, 256, 512, 3, 3, 1, 1, 2, 2)
Tuning...
[Task 1/10] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (10/10) | 1.87 s
WARNING:root:Could not find any valid schedule for task Task(func_name=conv2d_packed.vta, args=(('TENSOR', (1, 16, 14, 14, 1, 16), 'int8'), ('TENSOR', (32, 16, 3, 3, 16, 16), 'int8'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW1n16c', 'int32'), kwargs={}, workload=('conv2d_packed.vta', ('TENSOR', (1, 16, 14, 14, 1, 16), 'int8'), ('TENSOR', (32, 16, 3, 3, 16, 16), 'int8'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW1n16c', 'int32')). A file containing the errors has been written to /tmp/tvm_tuning_errors_ys3zu_2v.log.
INFO:autotvm:Get devices for measurement successfully!
[Task 2/10] Current/Best: 0.00/ 70.42 GFLOPS | Progress: (10/10) | 6.26 s
INFO:autotvm:Get devices for measurement successfully!
[Task 3/10] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (10/10) | 4.79 s
WARNING:root:Could not find any valid schedule for task Task(func_name=conv2d_packed.vta, args=(('TENSOR', (1, 8, 28, 28, 1, 16), 'int8'), ('TENSOR', (16, 8, 3, 3, 16, 16), 'int8'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW1n16c', 'int32'), kwargs={}, workload=('conv2d_packed.vta', ('TENSOR', (1, 8, 28, 28, 1, 16), 'int8'), ('TENSOR', (16, 8, 3, 3, 16, 16), 'int8'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW1n16c', 'int32')). A file containing the errors has been written to /tmp/tvm_tuning_errors_fqpg5ysx.log.
INFO:autotvm:Get devices for measurement successfully!
[Task 4/10] Current/Best: 31.80/ 31.80 GFLOPS | Progress: (10/10) | 8.19 s
INFO:autotvm:Get devices for measurement successfully!
[Task 5/10] Current/Best: 0.00/ 25.89 GFLOPS | Progress: (10/10) | 5.99 s
INFO:autotvm:Get devices for measurement successfully!
[Task 6/10] Current/Best: 0.00/ 72.11 GFLOPS | Progress: (10/10) | 6.80 s
INFO:autotvm:Get devices for measurement successfully!
[Task 7/10] Current/Best: 0.00/ 19.19 GFLOPS | Progress: (10/10) | 5.65 s
INFO:autotvm:Get devices for measurement successfully!
[Task 8/10] Current/Best: 0.00/ 5.28 GFLOPS | Progress: (10/10) | 7.45 s
INFO:autotvm:Get devices for measurement successfully!
[Task 9/10] Current/Best: 1.21/ 5.83 GFLOPS | Progress: (10/10) | 14.37 s
INFO:autotvm:Get devices for measurement successfully!
[Task 10/10] Current/Best: 0.00/ 6.53 GFLOPS | Progress: (10/10) | 4.38 s
INFO:autotvm:Extract 10 best records from the vta.resnet18_v1.log.tmp
Compile...
Upload...
Evaluate inference time cost...
Mean inference time (std dev): 69.65 ms (2.26 ms)
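In case it helps anyone reading output like this: a couple of tasks above finish with a best of 0.00 GFLOPS, which lines up with the "Could not find any valid schedule" warnings. Below is a rough sketch for picking those tasks out of a pasted log. The names PROGRESS, best_gflops and failed_tasks are made up for this post and are not part of TVM, and the regex assumes the progress-line format shown above.

```python
import re

# Assumed format, taken from the progress lines in the output above:
# "[Task 4/10] Current/Best: 31.80/ 31.80 GFLOPS | Progress: (10/10) | 8.19 s"
PROGRESS = re.compile(
    r"\[Task\s+(\d+)/\s*(\d+)\]\s+Current/Best:\s*([\d.]+)/\s*([\d.]+)\s+GFLOPS"
)


def best_gflops(log_text):
    """Map task index -> best GFLOPS, keeping the last progress entry per task."""
    best = {}
    for match in PROGRESS.finditer(log_text):
        best[int(match.group(1))] = float(match.group(4))
    return best


def failed_tasks(log_text):
    """Task indices whose best result stayed at 0.00 GFLOPS."""
    return sorted(t for t, g in best_gflops(log_text).items() if g == 0.0)


# Progress portions of the first three tasks from the run above.
sample = (
    "[Task 1/10] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (10/10) | 1.87 s\n"
    "[Task 2/10] Current/Best: 0.00/ 70.42 GFLOPS | Progress: (10/10) | 6.26 s\n"
    "[Task 3/10] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (10/10) | 4.79 s\n"
)

print(best_gflops(sample))
print(failed_tasks(sample))
```

The regex stops at "GFLOPS", so it should not matter whether a WARNING or INFO message is glued onto the end of a progress line.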