As the title asks. I couldn’t find a tutorial for it. Also, how is ‘warp’ different from ‘local’ + virtual thread? They seem subtly different to me, but both appear to work for computing and loading data with a stride.
Thanks!