I followed the tutorial "Using External Libraries in Relay" to enable cuDNN.
Previously the number of operators was 117; now it is 62. Does the estimated total latency printed by Ansor cover the whole model, or only those 62 operators? I ask because I see a huge latency improvement after enabling cuDNN.
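
To make the question concrete, here is a minimal sketch of what I suspect might be happening. It assumes the estimate is a weighted sum over only the operators that Ansor tunes, which I have not confirmed. The helper `estimated_total_latency`, the operator names, and all numbers are hypothetical, made up for illustration; none of this is TVM code or a real measurement.

```python
# Hypothetical illustration only: not TVM code, not real measurements.

def estimated_total_latency(tasks):
    """Sum latency * weight over the tasks that were tuned.

    `tasks` is a list of (name, best_latency_ms, weight) tuples, where
    weight is how many times the task appears in the network.
    """
    return sum(latency_ms * weight for _, latency_ms, weight in tasks)


# Made-up numbers: every operator is tuned.
all_tasks = [
    ("conv2d_a", 0.5, 4),
    ("conv2d_b", 0.25, 2),
    ("dense", 0.125, 1),
]

# Same network, but the conv2d operators are offloaded to an external
# library and so no longer appear in the tuned task list.
tasks_with_cudnn = [t for t in all_tasks if not t[0].startswith("conv2d")]

print(estimated_total_latency(all_tasks))         # every operator counted
print(estimated_total_latency(tasks_with_cudnn))  # offloaded operators missing
```

If the estimate really works this way, part of the drop I see would come from fewer operators being counted rather than from an actual speedup.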