[Opencl][Adreno] A Question about "-device=adreno"

Why does adding “-device=adreno” to the target make the warmup (first run) take longer, even though the subsequent runs become faster?

  • target = “opencl”:
    warmup time used: 1450 ms
    run time used: 388 ms
    run time used: 400 ms
    run time used: 393 ms

  • target = “opencl -device=adreno”:
    warmup time used: 7726 ms
    run time used: 61 ms
    run time used: 75 ms
    run time used: 74 ms

One thing I observed: without “-device=adreno” the model is compiled with the NCHW layout, while with it the layout becomes NCHW4c. That could explain the faster run time, but why is the warmup time longer?
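
To put the trade-off in numbers, here is a small sketch that only does arithmetic on the timings listed above. It is plain Python with nothing TVM-specific, and the variable names are my own. It estimates the steady-state speedup and roughly how many inference runs it takes before the extra warmup cost is paid back.

```python
# Timings copied from the measurements above, in milliseconds.
timings = {
    "opencl": {"warmup": 1450, "runs": [388, 400, 393]},
    "opencl -device=adreno": {"warmup": 7726, "runs": [61, 75, 74]},
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


base = timings["opencl"]
adreno = timings["opencl -device=adreno"]

# Extra time spent in the first run when "-device=adreno" is set.
extra_warmup_ms = adreno["warmup"] - base["warmup"]

# Average time saved on every run after the warmup.
saved_per_run_ms = mean(base["runs"]) - mean(adreno["runs"])

# Steady-state speedup and the number of runs needed to amortize the warmup.
speedup = mean(base["runs"]) / mean(adreno["runs"])
break_even_runs = extra_warmup_ms / saved_per_run_ms

print(f"extra warmup: {extra_warmup_ms} ms")
print(f"saved per run: {saved_per_run_ms:.1f} ms")
print(f"steady-state speedup: {speedup:.2f}x")
print(f"break-even after about {break_even_runs:.1f} runs")
```

With only three measured runs per target these figures are rough, but they suggest the longer warmup is recovered after a few dozen inferences at most.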

I am new to this topic and would be very grateful for any help.