[RFC] [uTVM] AOT optimisations for Embedded Targets

Glad to have your feedback @areusch :smiley_cat:

I agree with you wholeheartedly: we need to be careful with naming here. Based on your comments, it might be good to pick something like --packed-functions, which defaults to running MakePackedAPI. Maybe --packed-internal-functions or --packed-operator-functions?

Given the micro entrypoint is a C2 function, I’d propose we don’t add a C0 behaviour to the option. Is the use case here to provide the C2 entrypoint for the application and also provide a C0 wrapper for operators in the same bundle, with a different path in the application to allow calling them directly?

I thought about this a bit, and potentially we should rename --micro-entrypoint to just --entrypoint and default it to packed, which generates the packed function interface for module loading. A better name may be module, to reflect the entrypoint’s purpose. This should give us the ability to name the relevant interface.
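To make the naming concrete, here is a minimal sketch of how such an option could parse. It is plain `argparse`, not the actual TVM command line; the program name, the `micro` choice, and the help text are assumptions for illustration, and `packed` could equally be spelled `module` as suggested above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: only --entrypoint and its "packed" default come from
    # the proposal; the "micro" choice name is an assumption.
    parser = argparse.ArgumentParser(prog="compile-sketch")
    parser.add_argument(
        "--entrypoint",
        choices=["packed", "micro"],
        default="packed",
        help="interface generated for the model entrypoint",
    )
    return parser


def parse_entrypoint(argv):
    """Return the selected entrypoint kind for the given argument list."""
    return build_parser().parse_args(argv).entrypoint
```

With no arguments this selects `packed`, so existing module-loading flows would keep their behaviour unless the user opts into the other interface.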

As for implementing this in TIR, I believe there are a few limitations in how much TIR understands structs, which we’re proposing in [RFC] [uTVM] Embedded C Runtime Interface. That said, I think we’re aligned in the view that this should be done via code generation rather than this initial solution. I originally had an implementation that filtered the passes based on the entrypoint function, so I don’t think this is too hard to achieve.
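One possible reading of "filtering the passes based on the entrypoint function" is to apply a pass to every function except the named entrypoint. The sketch below models that with plain dictionaries rather than real TIR; `make_packed_api_stub`, `apply_pass_except`, and the `calling_conv` key are hypothetical stand-ins and not TVM APIs.

```python
from typing import Callable, Dict, List, Set

Func = Dict[str, str]
Pass = Callable[[Func], Func]


def make_packed_api_stub(func: Func) -> Func:
    # Hypothetical stand-in for a pass such as MakePackedAPI: it only tags
    # the function so the effect of the filter is visible.
    out = dict(func)
    out["calling_conv"] = "packed"
    return out


def apply_pass_except(funcs: List[Func], pass_fn: Pass, skip: Set[str]) -> List[Func]:
    """Apply pass_fn to each function whose name is not in skip."""
    return [f if f["name"] in skip else pass_fn(f) for f in funcs]


functions = [{"name": "entrypoint"}, {"name": "fused_conv2d"}]
lowered = apply_pass_except(functions, make_packed_api_stub, {"entrypoint"})
```

Here the operator function is rewritten while the entrypoint is passed through untouched, which is the behaviour an entrypoint-specific code generator would then pick up.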

This may be a place where we agree, just in slightly different ways. The current standalone_crt has 82 files in it, so providing a subset of these for an initial user scenario of taking a model and compiling it for deployment seems sensible to get people going. After the initial deployment of a model, integrating the whole standalone_crt may be necessary, and we shouldn’t prohibit that; it just takes a bit more understanding of the pieces you might need. My hope is that the embedded interface allows that use case: deploying with a few headers and then expanding when your application demands it.

Ideally I’d like to land the micro entrypoint as a separate PR once 8023 lands; then we’ll have both pieces to underpin the runtime interface (even if these interfaces change). That allows us to focus on the interface and discuss cleaning this up as you’ve suggested :smile_cat: