I see, indeed. For this case, the code should already be efficient enough: the temp will get narrowed into a size-1 buffer and then turned into a register during codegen.
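
To make the narrowing concrete, here is a rough plain-Python sketch. It is not real compiler output; the function names and the toy computation are made up for illustration. It only shows why a temp that is written and read within the same iteration can shrink to a single slot and then to a scalar.

```python
def full_temp(a):
    # Before narrowing: the temp has one slot per loop iteration.
    n = len(a)
    temp = [0.0] * n
    out = [0.0] * n
    for i in range(n):
        temp[i] = a[i] * 2.0
        out[i] = temp[i] + 1.0
    return out


def narrowed_temp(a):
    # After narrowing: only temp[i] is live in iteration i,
    # so a size-1 buffer is enough.
    n = len(a)
    temp = [0.0]
    out = [0.0] * n
    for i in range(n):
        temp[0] = a[i] * 2.0
        out[i] = temp[0] + 1.0
    return out


def register_temp(a):
    # After codegen (conceptually): the size-1 buffer becomes a scalar,
    # which a backend can keep in a register.
    n = len(a)
    out = [0.0] * n
    for i in range(n):
        t = a[i] * 2.0
        out[i] = t + 1.0
    return out
```

All three versions compute the same result; only the storage used for the temp changes.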