[AutoTuning] How to debug an AutoTVM failure?

One additional data point: I ran this on a different GPU configuration, using the same Docker image and script. This time:

  1. Task 3/13 tuned properly and did not drop into debug mode.
  2. Task 6/13 showed similar initial symptoms: the progress checkpoints reported 0.00/0.00 GFLOPS, and then the tuner switched to debug mode for the rest of the run.
  3. Debug mode continued to the end of the run, and I received a set of tuning results.

At this point, I assume there are two different issues described here:

  1. The conv2d schedule I am using for topi_nn_conv2d is invalid for task 3/13 on the original hardware. Switching to debug mode is a symptom of those invalid schedules, so the next step is to look further at the schedule definitions.
  2. There is an independent issue that occurs when task 3/13 switches to debug mode. Fixing the schedule might sidestep it. It also doesn’t happen every time, as evidenced by the 6/13 switch to debug mode on the other hardware, which does run to completion.
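
One way to check the first hypothesis is to tally the error codes recorded in the tuning log for the affected task. The sketch below is only an illustration using the standard library: it assumes the log is JSON lines where each record stores its measurement under a `result` key (or `r` in older log versions) as `[costs, error_no, all_cost, timestamp]`, with `error_no` equal to 0 for a successful measurement. The function name `tally_errors` and the sample records are hypothetical, and the field layout should be verified against the log your TVM version actually writes.

```python
import json
from collections import Counter


def tally_errors(lines):
    """Count how often each error code appears in JSON-lines tuning records.

    Assumption: each record keeps its measurement under "result" (or "r"),
    laid out as [costs, error_no, all_cost, timestamp].
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        result = record.get("result", record.get("r"))
        if result is None:
            continue  # not a measurement record in the assumed layout
        counts[result[1]] += 1  # index 1 is the assumed error_no field
    return counts


# Hypothetical sample records, not taken from a real tuning run.
sample = [
    '{"input": [], "config": {}, "result": [[0.0012], 0, 1.5, 1.0]}',
    '{"input": [], "config": {}, "result": [[1e9], 1, 0.2, 2.0]}',
    '{"input": [], "config": {}, "result": [[1e9], 1, 0.2, 3.0]}',
]

print(tally_errors(sample))
```

To use it on a real log, pass an open file instead of `sample`, for example `tally_errors(open(log_path))` with your own `log_path`. If nearly every record for a task carries the same nonzero code, that points toward the schedule or its configuration space rather than toward a sporadic runtime problem.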