Enable USMP by default in AoT Executor with runtime=crt

mehrdadh · February 22, 2023, 1:42am

Hi everyone,

One of the main challenges of running ML models on tiny devices is managing memory allocations at runtime. microTVM faces the same challenge, and it often causes out-of-the-box tutorials and demos to fail, especially when they target a new device. Ahead-of-time (AoT) compilation helped a lot with this issue by resolving most of the runtime memory requirements at compile time. In addition to AoT, the Unified Static Memory Planning (USMP) work enabled a better approach: planning all memory allocations for the model and the runtime at compile time. In the current state of TVM, if you build a standalone microTVM project using the AoT executor and enable USMP, the user/developer does not need to allocate any memory, since all allocations are already handled at compile time. This is great, but it is not the default route when we build a microTVM project.

Since USMP is critical to improving the microTVM developer experience, we propose enabling USMP by default when the AoT executor is selected and the runtime is the C runtime (CRT). CRT is mostly used by microTVM use cases such as CRT host, Zephyr, and Arduino. This change would not affect other build flows in TVM.
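For context, opting in to USMP explicitly today looks roughly like the sketch below. `USMP_CONFIG` and `build_aot_crt` are names made up for illustration, and the `"c"` target is only a placeholder assumption; a real project would pass its own target and any executor/runtime options it needs.

```python
# Sketch only: USMP_CONFIG and build_aot_crt are illustrative names,
# not part of TVM. "tir.usmp.enable" is the PassContext option that
# turns USMP on for a build.
USMP_CONFIG = {"tir.usmp.enable": True}


def build_aot_crt(mod, params, target="c"):
    """Build a Relay module with the AoT executor, the C runtime, and USMP enabled."""
    # Imported lazily so the config above can be inspected without TVM installed.
    import tvm
    from tvm import relay
    from tvm.relay.backend import Executor, Runtime

    with tvm.transform.PassContext(opt_level=3, config=USMP_CONFIG):
        return relay.build(
            mod,
            target=target,  # placeholder target; replace with your device target
            executor=Executor("aot"),
            runtime=Runtime("crt"),
            params=params,
        )
```

With the proposed default, the explicit `"tir.usmp.enable": True` entry would no longer be needed for this executor/runtime combination, and setting it to `False` would be the way to opt out.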

I’m happy to discuss this change here and hear your feedback.

Mehrdad.

mehrdadh · February 22, 2023, 1:43am

cc @areusch @alanmacd @MJKlaiber @Mousius @leandron

Mousius · February 22, 2023, 11:30pm

This is long overdue. I don’t think the other default allocation strategies are ever better than USMP for any use case, given the overheads introduced by page management and the limitations of the stack allocator.

As we discussed earlier, it’d be great to enable it for the AOT executor whenever possible and let people override it if they need to. Similarly, setting the runtime to C whenever we use AOT seems like a sensible default behaviour.

Thanks for raising this @mehrdadh!

tqchen · February 23, 2023, 1:08am

I agree that a static memory planner would be the right default choice for embedded settings.

Mousius · February 27, 2023, 9:06am

Hi @tqchen,

Wouldn’t it be confusing and difficult to maintain to make it specific to embedded? Do you see a reason why USMP couldn’t be the default for any user of the AOT executor?

tqchen · February 27, 2023, 8:30pm

Right now the AOT executor is mostly used in embedded settings. Perhaps “minimal runtime” or nostd runtime is a better term here. I do not see any issue with making it the default for CRT. So when I said that, I meant to refer to how such settings are typically used.

The nostd runtime itself, however, comes with (intentional) design choices: it does not include the full feature set of the TVM runtime, such as the object system, and as a result does not support related features such as closures, dynamic memory allocation, and zero-copy sharing with external projects (like PyTorch). So here I am talking about common use, not restricting the possible use cases of such a setting.

Of course, people should be able to use such a nostd runtime in non-embedded settings as well when the restrictions fit their usage scenarios; there is no question about that. Just as a Rust developer can use a nostd runtime to run a web server (though it is less common).

manupak · March 6, 2023, 10:55pm

Yeah, it’s about time we did that.