[mini-RFC] Name mangling in AOT

Introduction and Motivation

Since we introduced an Ahead-of-Time (AOT) flow in TVM in this PR, we can compile a single model into the Model Library Format (MLF) and build it to run on an embedded board.

It is hard to bundle more than one model together, since we end up with name clashes (the name of the network descriptor, the name of the main function, and potentially the names of the operators if they appear in multiple networks). To avoid this, we think it would be nice to let the user specify a model name to prepend to all the global names generated through the compilation process. Examples of global names are:

  • tvm__run_func for the main function
  • network for the network descriptor
  • fused_conv2d_shift_cast for the operator (the operator name is truncated to 80)

Moreover, none of those global symbols starts with an underscore, which means a user might define similar function names and cause clashes.

Proposal

We propose to prefix all the global symbols with a _tvm prefix by default. Moreover, we would give the user the opportunity to specify the model name. If the user specifies MYNET, this would be the end result:

  • MYNET_run_func for the main function
  • MYNET_network for the descriptor
  • MYNET_fused_conv2d_shift_cast... for the operator (the operator name is truncated to 80)

In case the user does not specify any model name, the result would be:

  • _tvm_run_func for the main function
  • _tvm_network for the descriptor
  • _tvm_fused_conv2d_shift_cast for the operator (the operator name is truncated to 80)
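
The two lists above can be reproduced with a small sketch. The helper name `mangle_name` and its constants are hypothetical (they are not taken from the PR), and applying the 80-character truncation to the unprefixed name is an assumption, since the RFC does not say where the truncation happens.

```python
from typing import Optional

MAX_OPERATOR_NAME_LEN = 80  # the RFC says operator names are truncated to 80
DEFAULT_PREFIX = "_tvm"     # prefix used when no model name is given


def mangle_name(symbol: str, model_name: Optional[str] = None) -> str:
    """Hypothetical helper: prefix a global symbol as the proposal describes."""
    prefix = model_name if model_name else DEFAULT_PREFIX
    # Assumption: truncation is applied before the prefix is added.
    return f"{prefix}_{symbol[:MAX_OPERATOR_NAME_LEN]}"


for symbol in ("run_func", "network", "fused_conv2d_shift_cast"):
    print(mangle_name(symbol, "MYNET"), mangle_name(symbol))
```

With a distinct model name per network, two models compiled into the same image no longer share any of these three symbols.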

cc: @Mousius @manupa-arm @areusch

I think adding a _tvm prefix is good. On the other hand, we need to be mindful of backward compatibility issues. The DSOModule loader can only look up a single prefix by default.

This will cause compatibility issues that we need to be careful about during an upgrade (perhaps by having the DSO loader look up both names and deprecating the legacy one in the next release cycle).
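
A minimal sketch of that "look up both, deprecate the legacy one" idea follows. It uses a plain dictionary as a stand-in for the symbol table of a loaded library; `lookup_with_fallback` is a hypothetical helper and not part of the DSOModule loader, and the `_tvm_` prefix is simply the one proposed above.

```python
import warnings
from typing import Callable, Mapping


def lookup_with_fallback(
    symbols: Mapping[str, Callable], name: str, prefix: str = "_tvm_"
) -> Callable:
    """Hypothetical lookup: try the prefixed symbol first, then the legacy one."""
    prefixed = prefix + name
    if prefixed in symbols:
        return symbols[prefixed]
    if name in symbols:
        # Legacy library built before the prefix existed: still works, but warns.
        warnings.warn(
            f"symbol '{name}' found without the '{prefix}' prefix; "
            "unprefixed lookup is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return symbols[name]
    raise KeyError(f"neither '{prefixed}' nor '{name}' was found")


# Stand-ins for a newly built library and a legacy one.
new_lib = {"_tvm_run_func": lambda: "new"}
old_lib = {"run_func": lambda: "legacy"}
print(lookup_with_fallback(new_lib, "run_func")())
print(lookup_with_fallback(old_lib, "run_func")())
```

Trying the prefixed name first means a library that exports both names resolves to the new symbol, so the legacy path can be removed later without changing behaviour for upgraded users.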

I agree with the general concept. Two questions about the name:

  1. Is it ever possible to remove the _tvm part? “Prefix” implies that the user can pick the name of the function starting at char 0.
  2. The prefix here looks like a “model name.” Is it worth it to just call the prefix the “model name?”

Hi Andrew, thanks for your comments. Yes, both are fine for me! I will edit the RFC; let me know what you think.

FYI, PR is here: [AOT] Name mangling in AOT by giuseros · Pull Request #8014 · apache/tvm · GitHub

Hi @areusch @tqchen @giuseros

I think it's best to use the “tvm” prefix nonetheless, so we don't pollute a namespace based on a user-given variable.

I don't follow why a “prefix” necessarily means the user being able to select it. If “prefix” is not the right term, we should not call it a prefix. The goal of this RFC is to propose a mechanism to reduce namespace pollution. Having “tvm” in the name states that the symbol belongs to TVM codegen'd artifacts, and allowing a model name further categorizes the artifacts of multiple compilations.

Therefore, I’d suggest we use:

  • tvm_MYNET_run (let's drop the func)

and (if a model name is not given)

  • tvm_run

@tqchen, I’d assume that if a model name is not given, the second option would be backward compatible with the DSOModule loader?

(edited to reflect @mjs suggestions)
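
The scheme suggested in this post (a fixed “tvm” prefix, with the model name as an optional middle component) can be sketched as follows. The function name `mangle_with_tvm_prefix` is hypothetical and only illustrates the naming rule.

```python
from typing import Optional


def mangle_with_tvm_prefix(symbol: str, model_name: Optional[str] = None) -> str:
    """Hypothetical helper: tvm_<model_name>_<symbol>, or tvm_<symbol> if no model name."""
    parts = ["tvm"]
    if model_name:
        parts.append(model_name)
    parts.append(symbol)
    return "_".join(parts)


print(mangle_with_tvm_prefix("run", "MYNET"))
print(mangle_with_tvm_prefix("run"))
```

Because the leading component never depends on user input, every generated symbol stays inside a single “tvm” namespace regardless of the model name chosen.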


Identifiers that begin with an underscore are reserved by the C standard. Conformant C code should not use them; dropping the _ and using just “tvm_…” would be conformant.
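
As a rough illustration of this rule, the sketch below flags global names that fall in the reserved space. The C standard reserves every identifier beginning with an underscore for use at file scope, which is where generated global symbols live. The helper `is_reserved_global` is hypothetical, and it deliberately ignores the other reserved categories (for example, names declared by standard library headers).

```python
import re

# Shape of a basic C identifier (ASCII only, for the purpose of this sketch).
_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_reserved_global(name: str) -> bool:
    """Hypothetical check: is this file-scope identifier reserved because of a leading '_'?"""
    if not _C_IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a valid C identifier: {name!r}")
    return name.startswith("_")


for candidate in ("_tvm_run_func", "tvm_run", "tvm_MYNET_run"):
    status = "reserved" if is_reserved_global(candidate) else "ok"
    print(candidate, status)
```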


I agree that having a common prefix is helpful in the DSO landscape to clearly identify functions generated by TVM. To facilitate discussion, consider the following code:

import tvm

m = tvm.runtime.load_module("x.so")
# Option P0: require explicit query using tvm_run
run = m["tvm_run"]
# Option P1: the underlying symbol is "tvm_run"
run = m["run"]

I believe we are still talking about P0 at the moment for simplicity (direct correspondence between symbol and packed func name), while allowing the AOT generator to prepend a prefix (like @areusch's comment about the prefix starting from char 0). My main concern about backward compatibility is for when we start to choose P1. If we go with P1, then we will need to put more thought into it.

@tqchen @manupa-arm @mjs @giuseros great discussions so far!

Identifiers that begin with an underscore are reserved by the C standard. Conformant C code should not use them; dropping the _ and using just “tvm_…” would be conformant.

I agree with this. Do we need to consider distinguishing this prefix from the one used within TVM itself, e.g. tvmgen_, so that stack traces are clear when compute is launched from the shared library?

I don't follow why a “prefix” necessarily means the user being able to select it. If “prefix” is not the right term, we should not call it a prefix.

Yeah this is just me stating that we were proposing tvm_<model_name>_<function_name>, where <model_name> is termed a prefix. I’d call tvm_<model_name>_ a prefix here.

I agree that having a common prefix is helpful in the DSO landscape to clearly identify functions generated by TVM. To facilitate discussion, consider the following code:

import tvm

m = tvm.runtime.load_module("x.so")
# Option P0: require explicit query using tvm_run
run = m["tvm_run"]
# Option P1: the underlying symbol is "tvm_run"
run = m["run"]

I believe we are still talking about P0 at the moment for simplicity (direct correspondence between symbol and packed func name), while allowing the AOT generator to prepend a prefix (like @areusch's comment about the prefix starting from char 0). My main concern about backward compatibility is for when we start to choose P1. If we go with P1, then we will need to put more thought into it.

@tqchen, in this case, the main function being queried should be the factory function for Module-based Model Runtime Interface, no? In that case, it seems reasonable to require:

import tvm

m = tvm.runtime.load_module("x.so")
# User looks up module via prefix
executor = m["customprefix_tvm_modelname"]()
# Or perhaps TVM can fall back to the standard one
executor = m["modelname"]()  # Looks up tvmgen_modelname

executor["set_input"]("foo", bar)  # bar: placeholder for the input data
# ...

It should be possible to include the prefix when looking up the module-specific functions. It’s also worth pointing out here that this discussion is for the case of loading a module using the C++ runtime. With the C runtime, I think it’s expected that the user chooses <prefix> to match their firmware implementation.