Performance regression with batch size > 1

Hi,

I’m currently testing the same model (a ResNet101 backbone) with different batch sizes, and it seems that performance worsens with batch size = 2 compared to running the model twice with batch size = 1.
I’ve already tried tuning with AutoTVM at batch size = 2, but the result is the same as above.
In my case the performance regression is > 10%.
Is this normal? Is there any workaround?
Thanks.
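For context, here is a minimal sketch of how the two configurations can be compared with TVM's graph executor. The input name `"data"`, the 224x224 input shape, and the `llvm` target string are placeholders and not taken from the original post; `mod1`/`mod2` stand for the same model imported with batch-1 and batch-2 input shapes.

```python
# Minimal benchmarking sketch (placeholder model, shapes, and target).
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor


def build_and_time(mod, params, batch, target="llvm", repeat=50):
    """Compile the model for the given batch size and return mean latency in ms."""
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
    dev = tvm.cpu(0)
    m = graph_executor.GraphModule(lib["default"](dev))
    # "data" and the 3x224x224 shape are assumptions about the model's input.
    m.set_input("data", np.random.rand(batch, 3, 224, 224).astype("float32"))
    timer = m.module.time_evaluator("run", dev, number=repeat)
    return timer().mean * 1e3


# mod1/params1 imported with input shape (1, 3, 224, 224),
# mod2/params2 with (2, 3, 224, 224) -- both come from the frontend importer.
# t1 = build_and_time(mod1, params1, batch=1)
# t2 = build_and_time(mod2, params2, batch=2)
# print(f"2 x batch-1: {2 * t1:.2f} ms   batch-2: {t2:.2f} ms")
```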


Has anyone ever experienced this issue?

Currently, many schedules assume the batch size is 1, so we parallelize over the C axis (assuming the layout is NCHW), not the N axis, and we don’t split the N axis. This is one reason we don’t get better performance at larger batch sizes. What hardware platform are you using?
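To illustrate the point, here is a toy tensor-expression example (not the actual topi/x86 conv2d schedule): the first schedule parallelizes only the C axis, so the N axis stays serial and batch = 2 gains nothing over running batch = 1 twice; the second fuses N and C before parallelizing, which is what would let the batch dimension benefit from CPU threads. All shapes are placeholders.

```python
# Illustrative only -- not the actual topi/x86 schedule.
import tvm
from tvm import te

N, C, H, W = 2, 64, 56, 56  # placeholder NCHW sizes
data = te.placeholder((N, C, H, W), name="data")
out = te.compute((N, C, H, W), lambda n, c, h, w: data[n, c, h, w] * 2.0, name="out")

# Schedule 1: parallelize only C -- the batch axis N is not threaded.
s1 = te.create_schedule(out.op)
n, c, h, w = s1[out].op.axis
s1[out].parallel(c)

# Schedule 2: fuse N and C before parallelizing, so threads also cover the batch.
s2 = te.create_schedule(out.op)
n, c, h, w = s2[out].op.axis
nc = s2[out].fuse(n, c)
s2[out].parallel(nc)

# Compare the lowered IR of the two schedules.
print(tvm.lower(s1, [data, out], simple_mode=True))
print(tvm.lower(s2, [data, out], simple_mode=True))
```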

Currently I’m using x86_64, but I may also try Android (arm64-v8a) in the future.

The x86 schedule doesn’t consider batch > 1, as I said before. You can refer to conv2d.py in topi/x86 to see this.

OK, thanks for the insight.