How to use compute to describe a cast [fp32 to fp16]

E.g., the input is (128, 16) with dtype=fp32, and I want to view the fp32 data as fp16, so that the output is (128, 16, 2) with dtype=fp16, but I don't know how to express this. I would like something like res = tvm.compute((128, 16, 2), func(), name="fp322fp16") when the input is (128, 16) dtype=fp32, and res = tvm.compute((128, 16), func(), name="fp162fp32") when the input is (128, 16, 2) dtype=fp16.

Has anyone met the same problem, or can anyone help me figure it out? I would appreciate it very much~

Hi, I’m not sure exactly what you are asking, but there is a casting function.

relay.cast at the Relay level

topi.cast at the TOPI level

Thanks. I want to reinterpret an fp32 tensor as fp16. If the shape of the fp32 tensor is (128, 16), the output shape would be (128, 16, 2) after the reinterpretation. I want to use a compute to describe this operator.
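To pin down what this reinterpretation means, here is a minimal host-side sketch using only Python's standard `struct` module rather than TVM. The helper names are hypothetical, little-endian word order is an assumption, and the halves are kept as raw 16-bit integers instead of fp16 values because some bit patterns would decode to NaN:

```python
import struct

def reinterpret_fp32_as_halves(values):
    """Split each fp32 value into two raw 16-bit words (assumed little-endian)."""
    out = []
    for v in values:
        raw = struct.pack("<f", v)          # the 4 bytes of the fp32 bit pattern
        lo, hi = struct.unpack("<2H", raw)  # two 16-bit words, low word first
        out.append([lo, hi])
    return out

def halves_to_fp32(pairs):
    """Inverse direction: join two 16-bit words back into one fp32 value."""
    return [struct.unpack("<f", struct.pack("<2H", lo, hi))[0] for lo, hi in pairs]

# One row of values that are exactly representable in fp32.
row = [1.0, -2.5, 0.15625]
pairs = reinterpret_fp32_as_halves(row)   # shape (3,) becomes shape (3, 2)
restored = halves_to_fp32(pairs)          # shape (3, 2) becomes shape (3,)
```

Applied to every row of a (128, 16) tensor, this gives the (128, 16, 2) layout described above, and the round trip loses no bits.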

Ah, so a 32-bit floating-point number would become two 16-bit floating-point numbers, where the first 16-bit number is the first 16 bits and the second number is the other 16 bits? That's a weird operation; what is the use case?

Umm, you know, some hardware instructions may not support 32-bit data, only 16-bit. So I want to use the 16-bit instructions to process 32-bit data, which requires reinterpreting the tensor in TVM.

It seems to me that the compute you want isn't really a cast: a cast loses the information in half of the bits, while yours doesn't. So to me, it's more like bit-level shifting and masking, like the following?

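a = e & 0xFFFF0000  # upper 16 bits of the 32-bit pattern
b = e & 0x0000FFFF  # lower 16 bits of the 32-bit pattern
out = a | b         # recombined 32-bit pattern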

IMHO, this operator would be more suitable to implement at the codegen level. For example, the codegen for your hardware could accept a (128, 16) tensor in FP32 and perform the above operations so that the data is stored in memory with the desired alignment.
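The difference between a value cast and the masking approach can be checked on the host. This is a rough sketch in plain Python (standard `struct` module only, not TVM); the helper names are hypothetical and it assumes the masks are meant as bitwise operations on the raw fp32 bit pattern:

```python
import struct

def fp32_bits(x):
    """Return the 32-bit pattern of x after rounding it to fp32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_fp32(bits):
    """Interpret a 32-bit pattern as an fp32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def cast_to_fp16(x):
    """Value cast: round x to the nearest fp16 value (lossy in general)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

x = bits_to_fp32(fp32_bits(0.1))  # 0.1 rounded to fp32 precision
e = fp32_bits(x)

a = e & 0xFFFF0000   # upper 16 bits, still in their original position
b = e & 0x0000FFFF   # lower 16 bits
hi16 = a >> 16       # upper half shifted down so it fits a 16-bit slot
out = a | b          # masking and recombining restores the original pattern

lossless = bits_to_fp32(out) == x   # True: no bits were dropped
lossy = cast_to_fp16(x) != x        # True: the value cast changed the number
```

The masked halves carry all 32 bits between them, whereas the cast keeps only an fp16 approximation of the value.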