I am working on optimizing ResNeSt and have run into denormal-number issues. The following is the solution, which sets denormal numbers to zero.
But I am not sure where I should put the code. There are some requirements:
- The target should be checked, because this solution can only be applied to CPU.
- Developers can easily turn flush_to_zero on or off.
- The function can only be called once.
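#include <xmmintrin.h>  // _mm_getcsr, _mm_setcsr
// DAZ (denormals-are-zero): treat denormal inputs as zero, MXCSR bit 6
_mm_setcsr( _mm_getcsr() | 0x0040 );
// FTZ (flush-to-zero): flush denormal results to zero, MXCSR bit 15
_mm_setcsr( _mm_getcsr() | 0x8000 );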
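
To make the three requirements concrete, here is a rough sketch of how I imagine a wrapper could look. The names `Target`, `ConfigureDenormals`, `kDazBit`, `kFtzBit` and the macro `DENORMAL_HAS_MXCSR` are all made up for illustration and are not part of any existing codebase. One thing I am assuming: MXCSR is per-thread state, so "called once" here means once per thread, and worker threads would each need their own call.

```cpp
// Compile-time check: only x86 builds with SSE have the MXCSR register.
#if defined(__SSE__) || defined(__x86_64__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // _mm_getcsr, _mm_setcsr
#define DENORMAL_HAS_MXCSR 1
#else
#define DENORMAL_HAS_MXCSR 0
#endif

// Hypothetical target enum; a real project would use its own device type.
enum class Target { kCPU, kGPU };

constexpr unsigned kDazBit = 0x0040;  // denormals-are-zero (inputs)
constexpr unsigned kFtzBit = 0x8000;  // flush-to-zero (results)

// Applies the requested setting the first time it is called on a thread
// with a CPU target. Returns true only if this call wrote MXCSR.
inline bool ConfigureDenormals(Target target, bool flush_to_zero) {
  if (target != Target::kCPU) return false;  // runtime target check
#if DENORMAL_HAS_MXCSR
  thread_local bool configured = false;  // MXCSR is per-thread state
  if (configured) return false;          // later calls are ignored
  configured = true;
  const unsigned mask = kDazBit | kFtzBit;
  const unsigned csr = _mm_getcsr();
  // Set both bits to enable, clear both bits to disable.
  _mm_setcsr(flush_to_zero ? (csr | mask) : (csr & ~mask));
  return true;
#else
  (void)flush_to_zero;
  return false;  // no MXCSR on this architecture
#endif
}
```

With something like this, the call `ConfigureDenormals(Target::kCPU, true)` could sit at the start of whatever thread runs the CPU kernels, and the boolean argument could come from a build or runtime option. Whether that is the right place in the codebase is exactly what I am asking about.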