Performance issue when running on Hexagon DSP target

Hi, I’m working on deploying a ResNet-18 model to a DSP target (Snapdragon XR2).

By following the tutorial here, I got android_launcher working end to end (E2E) on a real device. But the performance is really bad: one run takes about 228.6 seconds to finish.

I also tried deploying this model on the CPU, where an E2E run takes only about 170 ms. I know the DSP might not have well-tuned logs in TopHub, but is such a big gap expected?
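
For scale, here is a quick back-of-the-envelope sketch of the gap, using only the two timings quoted above. The variable names are made up for illustration, and both numbers are single E2E runs rather than averaged benchmarks.

```python
# Timings quoted in this post (single end-to-end runs, not averaged).
dsp_seconds = 228.6        # Hexagon DSP run
cpu_seconds = 170 / 1000   # CPU run: 170 ms converted to seconds

# Ratio of the two wall-clock times.
slowdown = dsp_seconds / cpu_seconds
print(f"DSP run is about {slowdown:.0f}x slower than the CPU run")
```

So the DSP run is roughly three orders of magnitude slower than the CPU run.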

I’m new to both DSP and TVM, so I hope someone can give me high-level guidance on how to optimize this model on the DSP given the current situation.

Thanks a lot in advance!

I think the data type of the model’s ops is fp32? Maybe fp32 performance on the DSP is extremely bad.

This is happening because the code is not well optimized yet. We’re working on improving the schedules and codegen, but there is still a lot to be done.

Hi @Fan, can you share how you compiled a model for an Android device with the Hexagon DSP backend?