I still want to see winograd enabled for NCHW too, since NHWC has many issues due to a lack of focus from the community.
For example, I had to spend some time adding roi_align support for NHWC. Although faster rcnn works great on NHWC after my roi_align fix, maskrcnn, because of its dynamic-batch conv2d and conv2d_transpose, does not seem to work on NHWC at all. Trying to compile a dynamic NHWC conv2d gives an obscure error (something related to shared memory). Worse, there is no NHWC conv2d_transpose operator.
So I think unusual workloads like that are where NHWC would easily break. Other issues include too many layout_transform ops, obscure runtime errors from shape funcs that assume NCHW, etc.