[AutoScheduler] Using AutoScheduler with custom hardware

Hello,

Since the tutorial only showcases examples for CPU and GPU, I want to ask whether using AutoScheduler with my own accelerator is feasible. I am not sure whether there is a way to annotate the schedules so that the search algorithm knows when it needs to load data from DRAM into on-chip memory, or vice versa.

My accelerator supports instructions for moving data between on-chip and off-chip memory and for performing computations. Both kinds of instruction are quite low level, and each supports only the movement or computation of a single NxM matrix. It is possible to target this hardware via AutoTVM, but I want to investigate whether AutoScheduler could increase performance even further.
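
To make the instruction model concrete, here is a rough sketch in plain Python of the load/compute/store pattern described above. Everything in it is a placeholder made up for illustration: the function names, the tile shape, and the element-wise add standing in for the compute instruction are assumptions, not the real instruction set and not any TVM API.

```python
# Hypothetical model of the accelerator; all names and values are illustrative.
N, M = 2, 2  # tile shape handled by one instruction (assumed values)


def load_tile(dram, row, col):
    """Copy one N x M tile from off-chip memory (DRAM) into an on-chip buffer."""
    return [dram[row + i][col:col + M] for i in range(N)]


def store_tile(dram, row, col, tile):
    """Copy one N x M tile from the on-chip buffer back to DRAM."""
    for i in range(N):
        dram[row + i][col:col + M] = tile[i]


def add_tiles(a, b):
    """Stand-in for a compute instruction that works on exactly one tile."""
    return [[a[i][j] + b[i][j] for j in range(M)] for i in range(N)]


def tiled_add(x, y):
    """Add two matrices tile by tile.

    Assumes both dimensions are multiples of N and M respectively.
    """
    rows, cols = len(x), len(x[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(0, rows, N):
        for c in range(0, cols, M):
            a = load_tile(x, r, c)                   # DRAM -> on-chip
            b = load_tile(y, r, c)                   # DRAM -> on-chip
            store_tile(out, r, c, add_tiles(a, b))   # compute, then on-chip -> DRAM
    return out
```

The question is essentially whether AutoScheduler can be told that every access to `x`, `y`, and `out` has to go through explicit steps like `load_tile` and `store_tile`, rather than assuming a cache hierarchy as on CPU or GPU.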