With version 0.10, the latency of my dbnet model (without auto-tuning) is 1s. With code from the main branch, the latency is 50s.
This happens on both the rocm and cuda backends.
What could be causing this slowdown?