[RFC] Unified Static Memory Planning

hi @manupa-arm,

This proposal aims to introduce a TIR → TIR pass, as illustrated above, which translates pre-USMP TIR to post-USMP TIR – eventually. Therefore, we are not planning to modify GraphPlanMemory.

OK. When you say “load a packed function of the tvm_main instead of json,” do you mean simply that GraphExecutor#run could just call tvm_main? If we make GraphExecutor consume the results of this interface, it seems that would change SetupStorage to issue basically 3 (or maybe a few more) allocate calls:

  1. for the input data (optional)
  2. for the CPU workspace pool
  3. for the output data

There could be additional calls if there are additional buffers, e.g. accelerator buffers.
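To make that concrete, here is a rough sketch in plain Python (not TVM code). The function name `setup_storage`, its parameters, the buffer names, and the `accel_sram` pool are all hypothetical stand-ins; it only illustrates the idea of SetupStorage shrinking to a handful of allocations.

```python
from typing import Callable, Dict, Optional


def setup_storage(
    input_bytes: int,
    workspace_bytes: int,
    output_bytes: int,
    extra_pools: Optional[Dict[str, int]] = None,
    allocate: Callable[[int], bytearray] = bytearray,
) -> Dict[str, bytearray]:
    """Hypothetical stand-in for SetupStorage once memory is pre-planned."""
    buffers: Dict[str, bytearray] = {}
    # 1. input data (optional: the caller may supply its own input tensors)
    if input_bytes > 0:
        buffers["input"] = allocate(input_bytes)
    # 2. the CPU workspace pool
    buffers["cpu_workspace"] = allocate(workspace_bytes)
    # 3. output data
    buffers["output"] = allocate(output_bytes)
    # any additional pools, e.g. accelerator buffers
    for name, size in (extra_pools or {}).items():
        buffers[name] = allocate(size)
    return buffers


# Illustrative sizes only.
buffers = setup_storage(1024, 4096, 256, extra_pools={"accel_sram": 2048})
```

The `allocate` callable is where a device-specific allocator would plug in; `bytearray` is just a placeholder.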

I think such a proposal might work to unify the memory planning around this AoT-based approach, but there are some cases which might mean we need to relax this proposal a bit, for instance the part about passing only the memory pools to operators. It may be that in order to support overriding parameters at runtime (which GraphExecutor currently allows), we need to keep passing individual function arguments, but these can be arranged (by AOT or GraphExecutor) to merely be offsets into the memory pools (or else be overridden with user-supplied tensors).
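A minimal sketch of what I mean, again in plain Python rather than TVM: `bind_args`, the `plan` layout of (pool, offset, size) tuples, and the argument names are assumptions for illustration, not an existing interface.

```python
from typing import Dict, Optional, Tuple

# Hypothetical planner output: argument name -> (pool name, byte offset, byte size)
Plan = Dict[str, Tuple[str, int, int]]


def bind_args(
    plan: Plan,
    pools: Dict[str, bytearray],
    overrides: Optional[Dict[str, bytearray]] = None,
) -> Dict[str, memoryview]:
    """Resolve each operator argument to a view into a pool, or to a user tensor."""
    overrides = overrides or {}
    bound: Dict[str, memoryview] = {}
    for arg, (pool, offset, size) in plan.items():
        if arg in overrides:
            # The user-supplied tensor takes precedence over the planned slot.
            bound[arg] = memoryview(overrides[arg])
        else:
            # Otherwise the argument is just an offset into the pool (no copy).
            bound[arg] = memoryview(pools[pool])[offset:offset + size]
    return bound


pools = {"cpu_workspace": bytearray(64)}
plan = {"data": ("cpu_workspace", 0, 16), "weight": ("cpu_workspace", 16, 32)}
user_weight = bytearray(b"\x01" * 32)

args = bind_args(plan, pools, overrides={"weight": user_weight})
args["data"][0] = 7  # writes through to the pool
```

The operator signature stays per-argument, so overriding a parameter at runtime only changes what one argument points at; the planned offsets for everything else are untouched.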

Yes, this is something a bit loosely defined in this proposal. Yes, it could be a PackedFunc – however, I’d imagine we would assume the memory planning algorithm to be compute heavy, so it would need to be performant. Therefore, we are leaning towards having something like TVM_REGISTER_PASS_CONFIG_OPTION to accept a String to choose the algorithm while providing a default. In the pass, we could maintain a String to C++ function ptr map. WDYT?

I think that we should use PackedFunc where we would like to provide pluggable infrastructure. It should still be possible to provide compute-optimized versions in C++. And it’s still possible to implement registries by prefixing function names, e.g. relay.memory.usmp.

Sure, actually there are two orthogonal main choices here (it’s just that their combinations make 4 :slight_smile: ). Moreover, feel free to suggest additional options as well.

I’m more wondering what the arguments to such operators might be: a name:key=value type of thing to support attributes on memory pools, or something else?
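For the sake of discussion, one way a name:key=value argument could be read is sketched below. This is purely a guess at a format: the comma separator between attributes, the `parse_pool_spec` name, and the `sram`, `size`, and `align` names are all made up for illustration.

```python
from typing import Dict, Tuple


def parse_pool_spec(spec: str) -> Tuple[str, Dict[str, str]]:
    """Parse a hypothetical 'name:key=value,key=value' memory pool description."""
    name, _, attr_text = spec.partition(":")
    attrs: Dict[str, str] = {}
    if attr_text:
        for item in attr_text.split(","):
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"expected key=value, got {item!r}")
            attrs[key.strip()] = value.strip()
    return name.strip(), attrs


pool_name, pool_attrs = parse_pool_spec("sram:size=65536,align=16")
```

Values are left as strings here; whether attributes should be typed is part of the same open question.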