I followed this tutorial: http://nnvm.tvmlang.org/tutorials/from_coreml.html#sphx-glr-tutorials-from-coreml-py. I measure the run time with code like this:
start = time.time()
module.run()
finish = time.time()
print(finish - start)
When I set target = tvm.target.mali(), I get the warning "src/codegen/build_opencl.cc:34: OpenCL runtime not enabled, return a source module…" and the run takes 20 seconds.
When I set target = "llvm", the run takes 3 seconds.
Isn't that too long? How can I measure the real time?
When building TVM, are you compiling with OpenCL support? (It is specified in config.mk.)
I build TVM on the server without OpenCL support and on the device with OpenCL support.
My understanding is that you need to compile with OpenCL support on the server side, even if the kernel itself is built from source at runtime.
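For reference, the relevant config.mk settings on the host side would look roughly like this (a sketch; the llvm-config path is an example, adjust it to your install):

```makefile
# Enable the OpenCL code generator on the host, so cross-compiled
# kernels can be emitted even though the host has no OpenCL runtime.
USE_OPENCL = 1

# Point LLVM_CONFIG at a working llvm-config binary for the host.
LLVM_CONFIG = /path/to/llvm/bin/llvm-config
```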
Should I set LLVM_CONFIG = /home/daming/software/clang+llvm-6.0.0-aarch64-linux-gnu/bin/llvm-config?
Right now I have LLVM_CONFIG = /home/daming/software/clang+llvm-5.0.1-x86_64-linux-gnu-ubuntu-16.04/bin.
Either 6.0.0 or 5.0.1 should work; I would just check that the path is correct, i.e., that it eventually points to the llvm-config binary.
I mean: on the server, should LLVM_CONFIG point to the aarch64 build or the x86_64 build?
You should compile with the target host set for the RPC device. If the RPC device is the RK3399 board, it should be
llvm -target=aarch64-linux-gnu -mattr=+neon
See @merrymercy’s repo for more details:
@daming5432’s question is how to measure the time cost.
TVM provides a time_evaluator function to measure time cost; it excludes the cost of compilation, RPC, and the first dry run.
You can see my benchmark script for examples.
When benchmarking, we should always do some warmup runs. For OpenCL, the first run compiles the kernel code on the board, which takes a lot of time.
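The warmup-then-measure pattern can be sketched in plain Python like this (the dummy workload stands in for module.run(); on a real board you would let TVM's time_evaluator do this for you on the remote device):

```python
import time

def benchmark(run, warmup=3, repeat=10):
    """Average the wall-clock time of `run`, discarding warmup iterations
    (e.g. the first OpenCL run, which JIT-compiles the kernels on the board)."""
    for _ in range(warmup):
        run()  # warmup runs: not timed
    times = []
    for _ in range(repeat):
        start = time.time()
        run()
        times.append(time.time() - start)
    return sum(times) / len(times)  # mean seconds per run

# Dummy workload standing in for module.run()
mean = benchmark(lambda: sum(range(10000)))
print("mean runtime: %.6f s" % mean)
```

With TVM itself, the equivalent is along the lines of `ftimer = module.module.time_evaluator("run", ctx, number=10)`, which runs the function on the RPC device and reports timing that excludes the first dry run.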
Thank you. I now measure the time correctly.
But when I run your example from merrymercy/tvm-mali/blob/3a116f4ec374dd541298a28822a8961ea4fc02f9/mali_imagenet_bench.py#L77-L78
with the command: python mali_imagenet_bench.py --target-host 'llvm -target=aarch64-linux-gnu -mattr=+neon' --host 192.168.1.157 --port 9091 --model all
I get this error:
[14:50:35] src/codegen/build_opencl.cc:34: OpenCL runtime not enabled, return a source module…
Traceback (most recent call last):
  File "mali_imagenet_bench.py", line 107, in
    run_case(model, dtype)
  File "mali_imagenet_bench.py", line 47, in run_case
    rparams = {k: tvm.nd.array(v, ctx) for k, v in params.items()}
  File "mali_imagenet_bench.py", line 47, in
    rparams = {k: tvm.nd.array(v, ctx) for k, v in params.items()}
  File "/home/daming/.local/lib/python2.7/site-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/ndarray.py", line 199, in array
    return empty(arr.shape, arr.dtype, ctx).copyfrom(arr)
  File "/home/daming/.local/lib/python2.7/site-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/_ffi/ndarray.py", line 165, in copyfrom
    source_array.copyto(self)
  File "/home/daming/.local/lib/python2.7/site-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/_ffi/ndarray.py", line 232, in copyto
    self.handle, target.handle, None))
  File "/home/daming/.local/lib/python2.7/site-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/_ffi/base.py", line 66, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: [14:50:50] src/runtime/rpc/../../common/socket.h:259: Socket SockChannel::Send Error:Connection reset by peer
There have been some breaking changes in the RPC protocol recently. You should update the TVM version on both your board and your host to the latest one.
Edit:
Or: do you have more than 2.5 GB of memory on your board? vgg16 needs at least 2.5 GB of memory to run on my board. If you don’t, you can skip vgg16 by replacing --model all with --model mobilenet or --model resnet18.
Thank you very much! I tested resnet18 and mobilenet; they work.
I saw in your mali_imagenet_bench.py that you load the model like this:
elif model == 'resnet18':
    net, params = nnvm.testing.resnet.get_workload(num_layers=18,
        batch_size=1, image_shape=image_shape, dtype=dtype)
Do you use an mxnet model? I don’t have mxnet on my RK3399 device. How can I load an mxnet model?
nnvm.testing.resnet.get_workload uses random weights; it is only meant for testing.
If you want to use an mxnet model with pretrained weights, you can follow this tutorial. NNVM supports downloading pretrained models from the Gluon model zoo.
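A rough sketch of that flow, following the nnvm from_mxnet tutorial of that era (the model name 'resnet18_v1' is an example; this runs on the host, so the board does not need mxnet installed):

```python
import nnvm
from mxnet.gluon.model_zoo.vision import get_model

# Download a pretrained Gluon model on the host machine
block = get_model('resnet18_v1', pretrained=True)

# Convert the Gluon block into an NNVM graph plus pretrained params;
# from here you compile and upload to the board over RPC as before.
net, params = nnvm.frontend.from_mxnet(block)
```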
Excuse me, what is the time cost when you measure the RK3399 running time?
Thanks.