I’m trying to run a ResNet-50 model on an Adreno GPU, but the performance isn’t very good. I’ve built TVM with OpenCL as the backend. I’ve tried changing texture_spatial_limit, but it has no effect on the inference time. I have a few questions about TVM on Adreno GPUs:
- Does TVM know how to use the texture processor inside Adreno efficiently?
- How can I run a neural network efficiently on an Adreno GPU without OpenCLML?
OS : QNX 7.1
Target : Qualcomm Board
@srkreddy1238
The spatial limit affects texture memory compatibility only. On most Qualcomm platforms it is 16K. We shouldn’t increase it beyond the hardware capability, and increasing this limit does not necessarily improve performance.
Texture support is enabled when the target is “opencl -device=adreno”. Auto-tuning refines the kernels further.
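For reference, here is a minimal sketch of how such a target could be put together in Python. The helper names `adreno_target_string` and `make_adreno_target` are hypothetical, the `host="llvm"` default is only a placeholder (the right host triple for a QNX board is not given in this thread), and passing `-texture_spatial_limit=` in the target string is an assumption based on the attribute names reported later in this thread.

```python
def adreno_target_string(texture_spatial_limit=None):
    """Compose the OpenCL target string that selects the Adreno device.

    Passing texture_spatial_limit appends it as a target attribute; this
    is an assumption about how the attribute is spelled on the command
    string, so check it against your TVM build.
    """
    parts = ["opencl", "-device=adreno"]
    if texture_spatial_limit is not None:
        parts.append(f"-texture_spatial_limit={texture_spatial_limit}")
    return " ".join(parts)


def make_adreno_target(host="llvm", texture_spatial_limit=None):
    """Build a tvm.target.Target for Adreno (requires TVM to be installed).

    host="llvm" is a placeholder; replace it with the host target string
    that matches your board.
    """
    import tvm  # imported lazily so the string helper works without TVM

    return tvm.target.Target(
        adreno_target_string(texture_spatial_limit), host=host
    )


if __name__ == "__main__":
    # Only prints the target string; no TVM installation is needed for this.
    print(adreno_target_string())
    print(adreno_target_string(16384))
```

The resulting target object would then be passed to the usual build call for the model.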
Thanks for your reply
I’m getting the following results with TVM on QNX 7.1 with an Adreno 663 GPU for the ResNet-50 model:
| Inference Time | Max_threads_per_block | Max_num_threads | Max_shared_mem_per_block | Texture_spatial_limit |
|---|---|---|---|---|
| 42.78 | 8 | 256 | 8192 | 16384 |
| 42.49 | 8 | 256 | 8192 | 8192 |
Increasing texture_spatial_limit had no impact on inference time. Is this observation consistent with how the texture processor should work?
With tuning I get an improvement in inference time, but by default TVM still seems unable to use the full capability of the texture processor.
Hi Varun,
An increased texture spatial limit only allows tensors with larger dimensions to use clImages (which are accessed through the texture hardware block). ResNet-50 may have only small tensor shapes that fit within either of these limits (8K or 16K), so performance may not be affected here.
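To make this concrete, here is a rough sketch that checks a few typical ResNet-50 activation shapes against both limits. It assumes batch size 1, a 224x224 input, and a blocked NCHW4c layout where the trailing axis of 4 becomes the image channels, the last spatial axis becomes the image width, and all remaining axes fold into the image height. That flattening rule and the shape list are assumptions for illustration, not something confirmed in this thread.

```python
from math import prod


def texture_2d_extent(shape):
    """Return (height, width) of the 2-D image for a blocked 5-D shape.

    Assumed rule: the last axis (size 4) is the RGBA channel, the axis
    before it is the width, and everything else folds into the height.
    """
    *outer, width, channels = shape
    if channels != 4:
        raise ValueError("expected a trailing channel block of 4")
    return prod(outer), width


def fits_in_texture(shape, spatial_limit):
    """True if both image extents are within the spatial limit."""
    height, width = texture_2d_extent(shape)
    return height <= spatial_limit and width <= spatial_limit


# Illustrative ResNet-50 activation shapes as (N, C/4, H, W, 4).
RESNET50_ACTIVATIONS = [
    (1, 16, 112, 112, 4),  # stem output, 64 channels
    (1, 64, 56, 56, 4),    # stage 1 output, 256 channels
    (1, 128, 28, 28, 4),   # stage 2 output, 512 channels
    (1, 256, 14, 14, 4),   # stage 3 output, 1024 channels
    (1, 512, 7, 7, 4),     # stage 4 output, 2048 channels
]

if __name__ == "__main__":
    for shape in RESNET50_ACTIVATIONS:
        height, width = texture_2d_extent(shape)
        print(shape, (height, width),
              fits_in_texture(shape, 8192),
              fits_in_texture(shape, 16384))
```

Under these assumptions the largest image height is 3584, so every listed tensor already fits at 8K, and raising the limit to 16K changes nothing for this model.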
Thanks for the reply @srkreddy1238
While running inference on a ResNet-50 model whose input layout is “NHWC”, I get the following error on the Adreno GPU:
varun/tvm/src/runtime/library_module.cc:76: InternalError: Check failed: ret == 0 (-1 vs. 0) : TVMError: OpenCL build error for device=7984afb8
Error: CL_OUT_OF_HOST_MEMORY
The same model with the “NCHW” input layout runs fine on the Adreno GPU. How can I resolve this error? Does Adreno only support the “NCHW” input layout?
@sanirudh @tqchen