GluonCV SSD benchmark run time on rk3399

Hi, I deployed a GluonCV SSD model on the RK3399 CPU and Mali GPU for testing. I have already auto-tuned the model. I expected the GPU to run faster, but it actually runs slower than the CPU. For a 512x512 image:

CPU: 6500 ms
GPU: 9800 ms

I also found a benchmark for a ResNet model in which the GPU is faster than the CPU. I tried it and got the same results as the wiki.
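For anyone who wants to compare numbers, here is a minimal sketch of how per-image run times like the ones above could be measured the same way on both devices. It is not the script used for the figures in this post. `benchmark` and `dummy_inference` are hypothetical names; `run_fn` stands for whatever callable performs one full forward pass, and on a GPU it is assumed to block until the output is actually available (for example by copying the result back to the host), otherwise the timing may only cover the kernel launch.

```python
import statistics
import time


def benchmark(run_fn, warmup=3, repeat=10):
    """Return (mean_ms, stdev_ms) of run_fn over `repeat` timed calls."""
    # Warm-up calls are not timed, so one-time costs do not skew the result.
    for _ in range(warmup):
        run_fn()

    samples_ms = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    # stdev needs at least two samples.
    stdev_ms = statistics.stdev(samples_ms) if len(samples_ms) > 1 else 0.0
    return statistics.mean(samples_ms), stdev_ms


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace with one forward pass of the model.
    def dummy_inference():
        sum(i * i for i in range(100_000))

    mean_ms, stdev_ms = benchmark(dummy_inference)
    print(f"mean: {mean_ms:.1f} ms, stdev: {stdev_ms:.1f} ms")
```

Reporting the mean together with the spread over several runs makes it easier to tell a real CPU/GPU gap from run-to-run noise.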

So why is GluonCV SSD slower on the GPU? BTW, if anyone can share run time data for the RK3399 or other Mali GPUs, I would appreciate it. Thank you!

Reference: https://tvm.ai/2018/10/03/auto-opt-all.html