If I want to store some small constant arrays (such as a slice of conv weights) in the CPU’s cache and keep them there for as long as I need, is there a primitive or function for doing that?
For instance, on GPUs we have ir_builder.allocate, which takes a memory scope that can be shared, global, or local:
ker_buf = irb.allocate("float32", (KH*KW*CI,), name="kernel buffer", scope="global")
I tried using this on the CPU, but changing the scope doesn’t seem to make any difference.