- `q_multiply_shift` takes four parameters: `x`, `y`, `q`, and `s`. Assume `q = 31`, `s = -1`, and a value of `x * y` as follows:
                      2nd rounding
                      |  1st rounding
                      |  |
                      V  v
bit idx: 63 62 ... 32 31 30 ... 0
bit val: s  s         0  1         # s for sign bit,
                                   # { bit_62 - bit_31 } will be kept
                                   # as the result (a 32-bit integer)
                                   # of neon.sqrdmulh
The first rounding (at bit 30) produces a carry into bit 31:
                   2nd rounding
                   |
                   V
bit idx: 62 ... 32 31
bit val: s         1      # The 2nd rounding is done by neon.srshl
                          # (shift left "s = -1" with rounding)
                          # Note: bit_32 will accept the "carry"
                          # out of "rounding(bit_31...)"
The second rounding then propagates this carry to bit 32, so the result differs from the DEFAULT path, which rounds only once.
This can be fixed as follows:
- use `sqdmulh` for `s < 0`
- use `sqrdmulh` for `s == 0`
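
The effect can be reproduced with a small Python sketch that models the three NEON instructions using plain integers. The helper names (`round_once`, `double_rounding`, `fixed`) are hypothetical, and the round-half-up reference is an assumption about the DEFAULT path, not a copy of the actual implementation.

```python
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1

def saturate32(v):
    """Clamp to the signed 32-bit range, as the saturating NEON ops do."""
    return max(INT32_MIN, min(INT32_MAX, v))

def sqrdmulh(a, b):
    """Model of SQRDMULH: doubling multiply, round, keep the high 32 bits."""
    return saturate32((2 * a * b + (1 << 31)) >> 32)

def sqdmulh(a, b):
    """Model of SQDMULH: doubling multiply, truncate, keep the high 32 bits."""
    return saturate32((2 * a * b) >> 32)

def rounding_shift_right(v, n):
    """Model of SRSHL with a negative shift of -n (n >= 1)."""
    return (v + (1 << (n - 1))) >> n

def round_once(x, y, s, q=31):
    """Assumed reference: one round-half-up shift of the 64-bit product."""
    total = q - s
    return (x * y + (1 << (total - 1))) >> total

def double_rounding(x, y, s):
    """Path described above: sqrdmulh followed by a rounding shift."""
    return rounding_shift_right(sqrdmulh(x, y), -s)

def fixed(x, y, s):
    """Proposed fix: truncate in the multiply when s < 0, round in the shift."""
    if s < 0:
        return rounding_shift_right(sqdmulh(x, y), -s)
    return sqrdmulh(x, y)  # s == 0: the multiply's rounding is the only one

x, y, s = 1 << 30, 1, -1  # x * y has bit 30 set and bit 31 clear
print(round_once(x, y, s), double_rounding(x, y, s), fixed(x, y, s))  # 0 1 0
```

With `x * y = 2**30`, rounding once gives 0, while the `sqrdmulh` + `srshl` sequence gives 1. The truncating multiply removes the first rounding, so only the shift rounds.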