Unable to reproduce benchmark results on RK3399 Mali-T860

When I try to reproduce the benchmark results from the following URL, I get good performance on the ARM CPU, but the results on the Mali GPU are terrible.
https://github.com/dmlc/tvm/wiki/Benchmark#mobile-gpu

Example results on Mali-T860:

[Image: Screenshot from 2019-04-24 00-18-04]

Which script are you using for benchmarking? Also, are you using the GPU for other tasks, such as running the desktop environment?

I ran the following command:
~/tvm/apps/benchmark$ python3 mobile_gpu_imagenet_bench.py --model rk3399 --rpc-key rk3399

I closed the other tasks and stopped the desktop on the RK3399 with:
sudo /etc/init.d/lightdm stop
sudo -i
echo performance > /sys/class/misc/mali0/device/devfreq/ff9a0000.gpu/governor

Then I reran the benchmark command, but there was no major change in the results.
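
One way to confirm that the governor change actually took effect before rerunning the benchmark is to read the devfreq attributes back. The following is a minimal sketch, assuming the sysfs path from the governor command above (it differs across boards and kernels) and the standard Linux devfreq attribute files governor, cur_freq, min_freq, and max_freq. The helper name read_devfreq_state is hypothetical and is not part of TVM.

```python
from pathlib import Path

# Assumption: same devfreq directory as in the governor command in this thread.
# Adjust it for other boards or kernel versions.
DEVFREQ_DIR = "/sys/class/misc/mali0/device/devfreq/ff9a0000.gpu"


def read_devfreq_state(devfreq_dir=DEVFREQ_DIR):
    """Return selected devfreq attributes as a dict.

    Attributes whose files are missing or unreadable map to None,
    so the function is safe to call on any machine.
    """
    state = {}
    for name in ("governor", "cur_freq", "min_freq", "max_freq"):
        try:
            state[name] = (Path(devfreq_dir) / name).read_text().strip()
        except OSError:
            state[name] = None
    return state


if __name__ == "__main__":
    # Print each attribute on its own line, e.g. "governor: performance".
    for name, value in read_devfreq_state().items():
        print(f"{name}: {value}")
```

If governor does not read back as performance, or cur_freq stays below max_freq while the benchmark is running, the GPU clock is one possible cause of the slow results worth ruling out.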

The following figure compares my reproduced results with the published benchmark.
Device: rk3399, Mali-T860
float32

Thanks for the detailed information; I will check on my RK3399 boards.

Tracked by this issue https://github.com/dmlc/tvm/issues/3088

I am using a different Mali GPU.

When I tune the model, some layers show no latency or speed information.

The tuning results are very poor.

The tvm_runtime was cross-compiled rather than built on the device itself.

Could you give me some ideas about this problem?