Thank you all for the interesting discussion.
The main reason I implemented `fpm` as an intrinsic was that I thought it was similar to other operations like mul, add, sub, etc. (and I thought that different vendors might plug in their own intrinsics to implement `fpm`).
But @kparzysz, you are absolutely right. If I implement this as a TOPI operator, it won’t be usable within a `compute` node. I didn’t think about that.
About writing this as `fpm(x,y,s)`, I am not sure I can do that. The aim of the fixed point multiply is to multiply an `int32` number `x` by a floating point number expressed as `int32(round(2^30*M))*2^s`, where `M` and `s` are the output of `[M,s] = frexp(f)`, `f` being a `float32` value. In `fpm(x,m,s)` I expect `m=round(2^30*M)` and `s` to be the shift that comes from `frexp`. In other words, I expect `m` and `s` to represent a floating point number with a given fixed point representation.
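To make the `(m, s)` convention concrete, here is a minimal Python sketch of what I mean. It is only an illustration, not the code in the PR: the helper name `to_fixed_point`, the round-to-nearest step, and the use of Python's unbounded integers in place of a 64-bit intermediate are all my assumptions for the example.

```python
import math


def to_fixed_point(f):
    """Split a float into (m, s), with [M, s] = frexp(f) and m = round(2^30 * M)."""
    M, s = math.frexp(f)       # f == M * 2**s, with 0.5 <= |M| < 1
    m = round((1 << 30) * M)   # scaled mantissa, fits in an int32
    return m, s


def fpm(x, m, s):
    """Approximate round(x * f) with integer operations only (illustrative)."""
    total_shift = 30 - s       # undo the 2^30 scaling, then apply 2^s
    prod = x * m               # a real kernel would need a 64-bit product here
    if total_shift <= 0:
        return prod << -total_shift
    # Round to nearest: add half of the divisor before the arithmetic shift.
    return (prod + (1 << (total_shift - 1))) >> total_shift


m, s = to_fixed_point(0.75)
print(fpm(100, m, s))
```

The point is that `m` and `s` only make sense as a pair: passing an arbitrary `y` in place of `m` would break the implicit `2^30` scaling that the final shift undoes.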
@tqchen, thanks for the explanation. Now everything is clear. However, we can implement the first optimization directly in Relay (`if (is_power_2(scale)) shift else fpm`), and I guess we might implement the second optimization directly in TIR or in TOPI.
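As a rough sketch of the first optimization, the dispatch could look like the Python below. This is not Relay code: `is_power_2` is only named above, so its body here is my guess, and `requantize` and `fpm_fallback` are hypothetical names for the illustration.

```python
import math


def is_power_2(scale):
    """True when scale is a positive power of two (its frexp mantissa is 0.5)."""
    M, _ = math.frexp(scale)
    return M == 0.5


def rounding_shift(x, shift):
    """Multiply x by 2**shift, rounding to nearest when shifting right."""
    if shift >= 0:
        return x << shift
    return (x + (1 << (-shift - 1))) >> -shift


def fpm_fallback(x, scale):
    """Generic path (hypothetical): fixed point multiply via m = round(2^30 * M)."""
    M, s = math.frexp(scale)
    m = round((1 << 30) * M)
    return rounding_shift(x * m, s - 30)


def requantize(x, scale):
    """Use a plain shift for power-of-two scales, otherwise the generic multiply."""
    if is_power_2(scale):
        _, s = math.frexp(scale)
        return rounding_shift(x, s - 1)   # scale == 2**(s - 1)
    return fpm_fallback(x, scale)


print(requantize(12, 0.25), requantize(100, 0.75))
```

The power-of-two branch avoids the wide multiply entirely, which is why it seems worth deciding at the Relay level where the scale is still a known constant.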
PS: I just uploaded the PR; please have a look.