I tried to measure my model using Metal, which is the recommended target on macOS, but the result is very bad, as shown below.
This troubles me a lot, because the same model runs faster on the CPU, at about 50 ms.
Out of ideas, I tried switching the target from metal to opencl, and the result looks much more normal.
The only difference in my code is the target setting.
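
For anyone who wants to reproduce the comparison, here is a minimal sketch of a timing harness in which only the target string changes between runs. It is not my actual code: `run_once` and `fake_run_once` are hypothetical stand-ins for building and running the model on a given target, and discarding the first call is an assumption that it may include one-time setup cost.

```python
import time
from statistics import mean


def time_target(run_once, target, warmup=1, repeat=10):
    """Return the mean latency in milliseconds of run_once(target)."""
    # Discard warm-up calls (assumed to possibly include one-time setup cost).
    for _ in range(warmup):
        run_once(target)
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_once(target)
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def fake_run_once(target):
    # Hypothetical placeholder: replace with the real model call for `target`.
    time.sleep(0.001)


if __name__ == "__main__":
    # The target string is the only thing that differs between the two runs.
    for target in ("metal", "opencl"):
        print(f"{target}: {time_target(fake_run_once, target):.2f} ms")
```

With the placeholder swapped for the real model call, the same loop can also include the CPU target so that all three numbers come from identical measurement code.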
In my opinion, either Metal needs extra settings or Metal on M1 currently has bugs.
Does anyone know what causes this problem?