This is in the context of adding new features for Adreno (OpenCL runtime), such as:
- clImage1D instead of clBuffers as the default allocation (global scope), since Adreno can benefit from the texture path.
- Recordable queue support
- Tighter integration between CLML and OpenCL during context switches (reusing CLML/TVM-allocated cl_mem objects across both without additional copies)
Features like these need changes in both the codegen and the runtime.
The codegen part is fairly easy: defining a virtual target “adreno” as shown below and extending CodeGenOpenCL as CodeGenAdrenoCL to generate sampler-based load/store for 1D buffers can achieve this. We reuse most of the OpenCL codegen here.
TVM_REGISTER_TARGET_KIND("adreno", kDLOpenCL)
. . . . .
.set_default_keys({"opencl", "gpu"});
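To make the subclassing idea concrete, here is a minimal, self-contained sketch of the override pattern. The class names `OpenCLCodegenSketch` and `AdrenoCodegenSketch`, the methods `EmitLoad`/`EmitStore`, and the kernel-side variable name `sampler` are all hypothetical stand-ins for illustration; this is not TVM's actual CodeGenOpenCL interface. Only `read_imagef` and `write_imagef` are real OpenCL C built-ins.

```cpp
#include <string>

// Hypothetical stand-in for the generic OpenCL code generator.
// It emits plain pointer-style accesses for global buffers.
class OpenCLCodegenSketch {
 public:
  virtual ~OpenCLCodegenSketch() = default;

  // Emit source text that loads from a global buffer at `idx`.
  virtual std::string EmitLoad(const std::string& buf,
                               const std::string& idx) const {
    return buf + "[" + idx + "]";
  }

  // Emit source text that stores `value` into a global buffer at `idx`.
  virtual std::string EmitStore(const std::string& buf,
                                const std::string& idx,
                                const std::string& value) const {
    return buf + "[" + idx + "] = " + value + ";";
  }
};

// Hypothetical Adreno variant: everything is inherited except the
// load/store emission, which goes through the image (texture) built-ins.
// Assumes the generated kernel declares a sampler_t named `sampler` and
// that `buf` is an image1d_t, so each access moves a float4.
class AdrenoCodegenSketch : public OpenCLCodegenSketch {
 public:
  std::string EmitLoad(const std::string& buf,
                       const std::string& idx) const override {
    return "read_imagef(" + buf + ", sampler, " + idx + ")";
  }

  std::string EmitStore(const std::string& buf,
                        const std::string& idx,
                        const std::string& value) const override {
    return "write_imagef(" + buf + ", " + idx + ", " + value + ");";
  }
};
```

The point of the sketch is only that the Adreno generator overrides the memory-access emission and inherits the rest, which matches the "reuse most of the OpenCL codegen" intent.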
On the runtime side, we don’t have the information needed to differentiate the regular OpenCL memory allocation strategy and management from the Adreno-specific one.
I see two options here:
- We can define kDLAdreno as a native target.
- Alter the graph runtime to supply the device-specific options via the graph JSON attributes.
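As a rough illustration of the second option, the runtime could pick its allocation strategy from per-device options instead of from a new device type. The sketch below assumes the options arrive as flat key/value strings; the attribute key `opencl.global_alloc`, the value `image1d`, and the names `AllocKind`, `DeviceOptions`, and `SelectGlobalAlloc` are all hypothetical and do not exist in TVM today.

```cpp
#include <map>
#include <string>

// Hypothetical allocation strategies for global-scope memory.
enum class AllocKind { kBuffer, kImage1D };

// Hypothetical: device-specific options as they might look after being
// parsed out of the graph JSON attributes.
using DeviceOptions = std::map<std::string, std::string>;

// Choose the allocation strategy for global-scope tensors.
// Falls back to plain buffers when the option is missing or has any
// other value, so graphs without the attribute keep today's behaviour.
inline AllocKind SelectGlobalAlloc(const DeviceOptions& opts) {
  auto it = opts.find("opencl.global_alloc");  // hypothetical key
  if (it != opts.end() && it->second == "image1d") {
    return AllocKind::kImage1D;
  }
  return AllocKind::kBuffer;
}
```

With this shape the device type stays kDLOpenCL and only graphs that carry the attribute opt into the texture path. The first option would instead make the same decision by comparing the device type against kDLAdreno.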
Any thoughts?
Thanks, Siva