I just wanted to note that you should be able to see sizable performance improvements from running FP16 on Apple M1. Check out the relay.transform.ToMixedPrecision pass to easily convert your model to FP16. I’ve also personally found that adding -mattr=+fullfp16 to the target string makes a big difference.
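
In case it helps, here is a rough sketch of how the two pieces might fit together. The helper names (m1_target, build_fp16) are made up for illustration, and the -mtriple=arm64-apple-macos value is my assumption for an M1 machine, so adjust it to whatever target you already use. I run InferType before the conversion as a precaution; your model may not need it.

```python
def m1_target(mattr=("+fullfp16",), mtriple="arm64-apple-macos"):
    """Build an LLVM target string with the given -mattr features.

    The mtriple default is an assumption for Apple M1; change it if
    your setup uses a different triple.
    """
    return f"llvm -mtriple={mtriple} -mattr={','.join(mattr)}"


def build_fp16(mod, params, target=None):
    """Convert a Relay module to FP16 and compile it (hypothetical helper).

    TVM is imported lazily so the target helper above works without it.
    """
    import tvm
    from tvm import relay

    target = target or m1_target()
    # Make sure types are populated before the mixed-precision rewrite.
    mod = relay.transform.InferType()(mod)
    # Rewrite eligible ops to float16.
    mod = relay.transform.ToMixedPrecision("float16")(mod)
    with tvm.transform.PassContext(opt_level=3):
        return relay.build(mod, target=target, params=params)
```

With a Relay module and params already loaded from your frontend of choice, this would be called as lib = build_fp16(mod, params). It is worth benchmarking with and without +fullfp16 on your own model, since the gain will depend on which ops dominate.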