Hi. I have a TVM module that uses some kernels and consists of only 2 FC layers and an activation.
I looked up the function by name in Python and tested it with the exported fp16 module and a NumPy fp16 array; it runs the forward pass and prints the values.
But in C++ I cannot get the values: the function call does not complete, it just crashes with a segmentation fault.
I have confirmed that the Python side works with inputs like this:
a = tvm.nd.array(np.random.rand(batch, K1).astype(np.float32), dev)
w0 = tvm.nd.array(
    np.random.uniform(size=[K1 // kc1, hdims[0] // stride, kc1, W, stride]).astype(
        np.float32
        # dtype
    ),
    dev,
)
coff = tvm.nd.array(np.random.uniform(size=[batch, 4]).astype(dtype), dev)
# coff = tvm.nd.array(np.random.uniform(size=[batch, 4]).astype(np.float32), dev)
o1 = tvm.nd.array(np.zeros([batch, hdims[0]]).astype(np.float32), dev)
fc1(a, w0, coff, o1)
The forward pass works if I set only coff to fp16 and keep the others as float32.
I did exactly the same in C++, but I get a segmentation fault. Could anyone help me figure out what might be causing the problem?
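
One thing I still need to rule out on the C++ side is a buffer-size mismatch for coff. This is only a guess, not a confirmed cause. A dense tensor's byte size follows from its shape and the DLPack-style dtype (bits, lanes), so an fp16 coff is half the size of a float32 one, and copying float32-sized data into it would overrun the buffer. Below is a small sketch of that arithmetic in plain Python; nbytes is a hypothetical helper and batch = 4 is a placeholder value, neither is from my real code.

```python
from math import prod


def nbytes(shape, bits, lanes=1):
    """Expected byte size of a dense tensor (hypothetical helper).

    Each element occupies (bits * lanes) bits, rounded up to whole bytes.
    """
    return prod(shape) * ((bits * lanes + 7) // 8)


batch = 4  # placeholder value, not the real batch size

coff_fp16 = nbytes([batch, 4], bits=16)  # coff stored as float16
coff_fp32 = nbytes([batch, 4], bits=32)  # coff stored as float32

# The fp16 buffer is half the size of the fp32 one.
print(coff_fp16, coff_fp32)
```

If the C++ code allocates coff with a 16-bit float dtype but fills it from a float32 host buffer of the same shape (or the other way round), the two sizes above would not match.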