Wrong output from `vision.multibox_transform_loc` on CUDA

fantasyRqg · September 16, 2021, 9:59am

`vision.multibox_transform_loc` gives wrong output when run on CUDA.

def multibox_transform_loc(
    cls_prob, loc_pred, anchor, clip=True, threshold=0.01, variances=(0.1, 0.1, 0.2, 0.2)
)
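For readers unfamiliar with the op, here is a minimal pure-Python sketch of the per-box decode that the `clip` and `variances` parameters control. It assumes the usual MXNet-style SSD convention (anchors as left, top, right, bottom; offsets as dx, dy, dw, dh). The name `decode_box` is hypothetical and is not part of TVM; the class-score and `threshold` handling is left out.

```python
import math


def decode_box(loc, anchor, clip=True, variances=(0.1, 0.1, 0.2, 0.2)):
    """Decode one (dx, dy, dw, dh) offset against one (left, top, right, bottom) anchor.

    Hypothetical helper for illustration only, assuming the SSD center/size encoding.
    """
    left, top, right, bottom = anchor
    dx, dy, dw, dh = loc
    vx, vy, vw, vh = variances

    # Anchor size and center.
    aw, ah = right - left, bottom - top
    ax, ay = (left + right) / 2.0, (top + bottom) / 2.0

    # Shift the center, scale the half-extents.
    ox = dx * vx * aw + ax
    oy = dy * vy * ah + ay
    half_w = math.exp(dw * vw) * aw / 2.0
    half_h = math.exp(dh * vh) * ah / 2.0

    box = [ox - half_w, oy - half_h, ox + half_w, oy + half_h]
    if clip:
        # Clamp coordinates to the normalized image range [0, 1].
        box = [min(1.0, max(0.0, v)) for v in box]
    return box
```

With a zero offset the decoded box should equal the anchor, which makes a quick sanity check for a few rows of the dump.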

`vision.multibox_transform_loc` output

Diff of the GPU output against the data dump (from Caffe blobs):

Shapes

The three outputs are identical above line 5587. In image 3 (the correct output), the values below line 5587 are extremely small.
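To locate the point of divergence without eyeballing a diff, something like the following could be used. This is only a sketch: it assumes both dumps have already been loaded as lists of numeric rows, and `first_mismatch` and its tolerances are hypothetical choices, not part of TVM or Caffe.

```python
def first_mismatch(expected, actual, rtol=1e-4, atol=1e-5):
    """Return the index of the first row that differs beyond tolerance, or None.

    Rows are compared element by element; only the common prefix of the two
    dumps is checked, so a length difference is not reported.
    """
    for i, (row_e, row_a) in enumerate(zip(expected, actual)):
        for e, a in zip(row_e, row_a):
            if abs(e - a) > atol + rtol * abs(e):
                return i
    return None
```

Calling `first_mismatch(reference_rows, gpu_rows)` on the two dumps would return the first row index where the GPU output departs from the reference.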

Any suggestions would be greatly appreciated!

This repo can reproduce the bug on tvm repo commit 4c77bae772ad68f3dc4dda009384cb65af9dfaec

vinx13 · September 15, 2021, 5:54pm

How do you run these three tests? Is the CUDA kernel for multibox_transform_loc the same in all of them?

fantasyRqg · September 16, 2021, 9:10am

This repo can reproduce the bug on tvm repo commit 4c77bae772ad68f3dc4dda009384cb65af9dfaec

fantasyRqg · September 16, 2021, 9:14am

Exact same CUDA kernel

vinx13 · September 21, 2021, 7:27pm

Thanks. I haven’t found the exact reason, but this seems to be related to some optimization pass. Setting opt_level=0 produces the correct result.
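For anyone wanting to try this workaround, a minimal sketch of building with a chosen `opt_level` is below. It assumes a Relay module `mod` and its `params` are already available (for example from a frontend importer) and that the target is `"cuda"`. The helper name `build_with_opt_level` is hypothetical. The optional `disabled_pass` argument is passed straight to `PassContext` and might help narrow down which pass is responsible.

```python
def build_with_opt_level(mod, params, opt_level=0, target="cuda", disabled_pass=None):
    """Build a Relay module under a PassContext with the given opt_level.

    Hypothetical helper; `mod` and `params` are assumed to come from the
    reproduction script.
    """
    # Imported lazily so the helper can be defined on a machine without TVM.
    import tvm
    from tvm import relay

    with tvm.transform.PassContext(opt_level=opt_level, disabled_pass=disabled_pass):
        return relay.build(mod, target=target, params=params)
```

Building once with `opt_level=0` and once with the default level, then comparing the two outputs, would show whether the optimization level alone accounts for the difference.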

fantasyRqg · September 22, 2021, 9:41am

Thanks for testing this problem. I’m not familiar with CUDA, so I’ll try to learn enough to understand what’s going wrong.