Yes, I agree. We should have a separate strategy for CUDA conv2d, since it contains implementations that only apply to CUDA. But it should still reuse the implementations that are generally applicable to GPU targets.
I can prepare a PR to fix this in a few days.
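
To make the idea concrete, here is a rough sketch of the layering in plain Python. It is not the actual strategy code: the registry, the function names, and the implementation names are all hypothetical, and the only assumption carried over from the discussion is that a CUDA target can fall back on whatever is registered for generic GPU targets.

```python
from typing import Callable, Dict, List, Tuple

# (implementation name, priority); all names below are hypothetical.
Impl = Tuple[str, int]


def conv2d_strategy_gpu() -> List[Impl]:
    """Implementations assumed to work on any GPU target."""
    return [("conv2d_generic_gpu", 10)]


def conv2d_strategy_cuda() -> List[Impl]:
    """CUDA strategy: reuse the generic GPU implementations,
    then add the ones that only apply to CUDA."""
    impls = list(conv2d_strategy_gpu())
    impls.append(("conv2d_cuda_only", 20))
    return impls


# Hypothetical registry mapping a target key to its strategy function.
_STRATEGIES: Dict[str, Callable[[], List[Impl]]] = {
    "gpu": conv2d_strategy_gpu,
    "cuda": conv2d_strategy_cuda,
}


def select_conv2d_strategy(target_keys: List[str]) -> List[Impl]:
    """Return the strategy for the first key that has one registered.

    Keys are assumed to be ordered from most to least specific,
    e.g. ["cuda", "gpu"] for a CUDA target.
    """
    for key in target_keys:
        if key in _STRATEGIES:
            return _STRATEGIES[key]()
    raise KeyError(f"no conv2d strategy registered for keys {target_keys}")
```

With this layout, a CUDA target gets both the generic and the CUDA-only entries, while another GPU target (keys such as `["rocm", "gpu"]`, again just an example) falls through to the generic GPU strategy and never sees the CUDA-only implementation.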