TVMC - UMA Integration

Let’s discuss how UMA could be integrated into TVMC. I did a first brute force attempt to figure out the challenges.

CLI

A simple approach would be to just give TVMC an additional argument, e.g. --uma-backends, that adds directories to scan for UMA backends. An issue here is that this argument would have to be handled in a two-phase way so that the TVMC argparser recognizes the target and target-attribute arguments.

Each discovered UMA backend would be loaded as a Python module, its backend class instantiated and registered with backend.register().

Example implementation (thanks to @cgerum for the uma_loader code):
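To illustrate the idea (this is not the actual uma_loader code, just a sketch with made-up names), such a loader could scan the given directories, import each Python file and register every UMABackend subclass it finds:

import importlib.util
import inspect
import pathlib

from tvm.relay.backend.contrib.uma.backend import UMABackend

def load_uma_backends(directory):
    """Import every Python module in `directory` and register the UMA backends it defines."""
    backends = []
    for path in pathlib.Path(directory).glob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if issubclass(cls, UMABackend) and cls is not UMABackend:
                backend = cls()       # assumes a no-argument constructor
                backend.register()    # makes the target known to TVM
                backends.append(backend)
    return backends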

Multiple targets

TVMC currently does not support multiple targets. Here we would need to discuss what the syntax for multiple targets should look like. For example, a partitioned UMA mod requires the following:
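For illustration, with the vanilla accelerator from the UMA tutorial this boils down to passing both a plain C target and the UMA target (with C as host) to relay.build (sketch; mod and params are assumed to exist):

import tvm
from tvm import relay

# both the generic C target and the UMA target are needed for a partitioned module
targets = [
    tvm.target.Target("c"),
    tvm.target.Target("vanilla_accelerator", host=tvm.target.Target("c")),
]
lib = relay.build(mod, target=targets, params=params)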

This syntax would then have to be implemented in the tvmc/target.py:parse_target function. As far as I understand, a string like vanilla_accelerator,c would currently create tvm.target.Target("vanilla_accelerator", host=tvm.target.Target("c")). A suggestion would be ["c","vanilla_accelerator,c"]. Alternatively, we could make --target an action="append" argument to allow multiple targets.

Instead I implemented a simple hack for testing. If given the target string {}, I construct the multi target manually. (Sidenote: The JSON string parsing does not work because target_host is undefined on that branch in target_from_cli)

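Sketched roughly (this only conveys the idea of the hack, not the actual patch):

# inside tvmc's target handling: special-case the literal target string "{}"
if target == "{}":
    tvm_target = [
        tvm.target.Target("c"),
        tvm.target.Target("vanilla_accelerator", host=tvm.target.Target("c")),
    ]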

Additionally, compile_model needs to be able to handle this multi target:

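Conceptually this amounts to letting compile_model forward a target list unchanged to relay.build; the sketch below is simplified and the variable names are only indicative:

# simplified sketch of the multi-target path in compile_model
if isinstance(tvm_target, list):
    # skip the single-target canonicalization and hand the list to relay.build
    graph_module = relay.build(mod, target=tvm_target, params=params)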

Partitioning

Finally, we need to invoke the mod = backend.partition(mod) interface of the UMA backends. For testing, this code just invokes them for all the backends, but of course we would only do this for the selected backends, which we could query from the selected targets.

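The testing code boils down to something like this, where loaded_uma_backends is a placeholder for whatever list the backend loader produced:

# testing only: partition for every loaded backend;
# a real implementation would restrict this to the selected targets
for backend in loaded_uma_backends:
    mod = backend.partition(mod)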

I didn’t quite understand the extra_targets here. It seems they are treated similarly by calling a partition_function, but as far as I understood, these targets are not properly registered in TVMC.


I tested this in our MLonMCU flow and everything seems to work fine. A given model is correctly partitioned and rewritten, and the generated C code calls the accelerator function.

Please let me know your comments and suggestions.

@MJKlaiber @cgerum @paulpb @PhilippvK


@r.stahl Regarding the multi-target discussion, I would recommend looking into what @mbs-octoml proposed (and implemented) for Collage (see RFC: https://github.com/apache/tvm-rfcs/pull/62), since that one seems to deal with the same topic.

Thanks @r.stahl. Great approach. You are right, tvmc support for UMA is really one of the important missing features.

With regards to the CLI:

  • I think adding --uma-backends makes sense as you proposed it. Additionally, we could add the content of an environment variable $UMA_BACKENDS if the directory is not provided via the CLI
  • The syntax for a target like ["c","vanilla_accelerator,c"] seems a bit hard to use from the CLI. What about adding the multiple-target case automatically?

Proposed CLI syntax:


tvmc compile \
--target "vanilla_accelerator" \
--uma-backends "/tvm/my_uma_backends" \ # OR via $UMA_BACKENDS
--input-shapes "data:[1,3,224,224]" \
--output resnet50-v2-7-tvm.tar \
resnet50-v2-7.onnx

Whenever the --target is found in the --uma-backends step, we add the C target and the UMA target as you showed:
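In other words, something along these lines (a sketch; uma_target_names stands for whatever bookkeeping the --uma-backends step produces):

if target_name in uma_target_names:
    # expand the single UMA target into the C + UMA multi-target combination
    tvm_target = [
        tvm.target.Target("c"),
        tvm.target.Target(target_name, host=tvm.target.Target("c")),
    ]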

At least as long as we only want to use one UMA backend, this should work.

What do you mean by extra targets?

@PhilippvK Are you referring to this PR? https://github.com/apache/tvm/pull/11382 As far as I can tell, it does not define a textual representation for multi targets. But yes, the canon_multi_target_and_host helper originates from there.

@MJKlaiber Thanks for the feedback. I did not find any use of environment variables in TVMC besides the $TVM_CONFIGS_JSON_DIR, so I’d appreciate some feedback from TVMC veterans on adding $UMA_BACKENDS. @leandron @gromero

Good idea about just always using the multi target, to avoid the multi target challenge on the CLI entirely. This should not have any downside as far as I can tell.

I was referring to this regarding extra_targets: https://github.com/apache/tvm/blob/main/python/tvm/driver/tvmc/target.py#L371 and https://github.com/apache/tvm/blob/main/python/tvm/driver/tvmc/compiler.py#L300 . I dug a bit deeper and found this related PR: https://github.com/apache/tvm/pull/7304 and this discussion post: BYOC Partitioning Support on TVMC by @leandron . So this seems related to BYOC partitioning. Maybe there is some room for unification here.


Hi @r.stahl,

Interesting to see people coming back to the tvmc flow :slight_smile:

TVMC Multi-target

I’m curious what you mean by tvmc not supporting multi-targets? We commonly use:

tvmc --target=ethos-u,cmsis-nn,c --target-cmsis-nn-mcpu=cortex-m55 --target-c-mcpu=cortex-m55

I see no reason you couldn’t equally have:

tvmc --target=vanilla,c --target-vanilla-topping=sprinkles

UMA Backend Registration

It’s probably important to figure out how to make this a bit more user-friendly than --uma-backends, as that mentions a specific implementation detail of UMA. Potentially we can use something like --dynamic-targets <dir> or --extra-targets <dir>, as these are essentially dynamically registered Targets?

Fully support @MJKlaiber’s suggestion to also have an environment variable :smile_cat:


Thanks @r.stahl for bringing this forward.

I agree that this proposed API should be doable, and would be a preferred way of integration.

The current implementation searches for all backend implementations in a package “uma_backend” on the Python path (a rough discovery sketch follows the list below). Using a well-known Python package has some advantages:

  1. It encourages a proper packaging of backends as python packages

  2. makes them distributable through pypi

  3. Allows dependency management using the author’s required TVM versions

  4. Has a defined way of combining backends from multiple sources, e.g. making uma_backends a Namespace Package

  5. There is already a defined, although slightly clumsy, mechanism for discovering backends in non-standard locations:

    PYTHONPATH="/my/backend/location/:$PYTHONPATH" tvmc --target=vanilla,c
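To sketch how that discovery could look, assuming uma_backends is a namespace package and each contributed module exposes a get_backend() factory (both conventions are assumptions, not existing code):

import importlib
import pkgutil

import uma_backends  # well-known namespace package on the Python path

def discover_backends():
    # import every module contributed to the uma_backends namespace package
    # and collect the backend objects they expose
    found = []
    prefix = uma_backends.__name__ + "."
    for info in pkgutil.iter_modules(uma_backends.__path__, prefix):
        module = importlib.import_module(info.name)
        if hasattr(module, "get_backend"):  # assumed factory convention
            found.append(module.get_backend())
    return found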
    

Using the Python module system for tvmc extensions does have some caveats, though, IMHO:

  1. Might be considered slightly less convenient for a quick-and-dirty backend
  2. It ties the extension mechanism to Python as the extension language; this is IMHO totally consistent with UMA design principles, and it might also be OK as an extension mechanism for tvmc
  3. There might be some security concerns, but these probably exist with any third-party extension mechanism without a proper code-signing and review infrastructure

@cgerum, apologies, I completely missed the Python package thing - given tvmc is written in Python, it seems reasonable to expect modules for it to also be written in Python. The only remaining issue I have with it is specifically calling out UMA to the user, who I doubt would want to figure out what UMA is or why they’re interested in it. Thus, potentially we can set UMA as the standard Target plugin system in tvmc and still use the more generic options?

Sorry @Mousius, I just wanted to add the possibility of not adding any uma/extension specific command line interfaces.

If we consider this possibility, one could already use it as the basis for a generic tvmc extension mechanism, by at least renaming the well-known path from uma_backends to something along the lines of tvmc_extensions and defining a minimal abstract interface for tvmc extensions.

@cgerum I think that’d be great! Is that an acceptable amount of work to tie into this though? We could introduce an --experimental-uma-backends flag to begin with and then move it to --tvmc-extension <x> later? Or potentially we can put up a first go under --experimental-tvmc-extension so as not to block the valuable UMA part of this?

@Mousius @cgerum @r.stahl Having an extension interface for TVMC is definitely desirable IMHO. Should we start a new discussion/PreRFC for collecting requirements etc. now or just see how this works out for the UMA-usecase and discuss how to improve/generalize it more and more afterwards?

In the spirit of moving forwards iteratively, I’d suggest that if it’s easy to change uma_backend to tvmc_extension, we put it behind an --experimental-tvm-extension flag with something functional; if that’s a bit tricky, we can do --experimental-uma-backend :smile_cat:

No need to block this on future unrefined work if it’s under something that explicitly signals to the user that it’s experimental; the integration also looks reasonably straightforward given the existing tvmc support for multi-target?

In parallel with that, it’d be great to start the formalisation of what an extension is :smile_cat: I believe @cgerum’s proposal of the namespace package is pretty good to formalise into an RFC? It’d be good to consider whether the same infra could work at the tvm and relay.build Python APIs at the same time so as not to create functionality solely in tvmc.

Thanks for all the feedback!

Based on this input, I have created a first draft patch to add the extension mechanism to TVMC:

Then one can use the following to add a directory containing a tvm_extension python package:

tvmc compile MODEL --experimental-tvm-extension path/to/ext

There, I put a simple extension that defines a UMA backend:

from tvm.driver.tvmc.extensions import TVMExtension
from .backend import VanillaAcceleratorBackend

class MyExt(TVMExtension):
  def __init__(self):
    self.backend = VanillaAcceleratorBackend()

  def uma_backends(self):
    return [self.backend]

At this point it is still unclear to me how the existing TVMC CLI could be used to allow for multiple targets. In particular, what would vanilla-topping=sprinkles be and where could it be defined so that a second c target is passed to relay.build? @Mousius @cgerum

Hi @r.stahl,

I think you just need to hook it into the glue logic that invokes the graph partitioner:

Something like:

ut_backend.target_name: {
    "config_key": None,  # Can't remember if you need this for CLI options
    "pass_pipeline": ut_backend.partition,
}

Methinks we can have an additional dynamic_registered_codegen block and then merge them in get_codegen_names and get_codegen_by_target? Should make it easier to test whether it’s been registered if we don’t overlap them :thinking:
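A rough sketch of that, assuming the existing static registry in tvmc's composite_target.py is the REGISTERED_CODEGEN dict (DYNAMIC_REGISTERED_CODEGEN and register_codegen are names I’m making up here):

DYNAMIC_REGISTERED_CODEGEN = {}  # filled when an extension registers a backend

def register_codegen(name, codegen):
    DYNAMIC_REGISTERED_CODEGEN[name] = codegen

def get_codegen_names():
    return list(REGISTERED_CODEGEN) + list(DYNAMIC_REGISTERED_CODEGEN)

def get_codegen_by_target(name):
    if name in DYNAMIC_REGISTERED_CODEGEN:
        return DYNAMIC_REGISTERED_CODEGEN[name]
    return REGISTERED_CODEGEN[name]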

Hi @Mousius

Not sure if I follow your suggestion. As far as I understand, it helps to unify the partition calls in compiler.py:compile_model. I added this to the patch:

However, this still fails to run with the following error:

  File "/home/user1/uma/tvm/python/tvm/driver/tvmc/compiler.py", line 454, in build
    return relay.build(
[snip]
  13: tvm::relay::backend::AOTExecutorCodegen::Codegen(tvm::IRModule, tvm::relay::Function, tvm::runtime::String)
[snip]
  File "/home/user1/uma/tvm/python/tvm/relay/backend/contrib/uma/api/lower.py", line 136, in relay_to_tir
    func = self._lower_relay_to_tir(func)
  File "/home/user1/uma/tvm/python/tvm/relay/backend/contrib/uma/api/lower.py", line 75, in _lower_relay_to_tir
    if target.kind.name != compiler_attr:
AttributeError: 'NoneType' object has no attribute 'kind'

If I just patch the target in compiler.py:compile_model to:

tvm_target = [tvm.target.Target("c"), tvm.target.Target("vanilla_accelerator", host=tvm.target.Target("c"))]

everything works fine.

@r.stahl do you have an example extension you can provide?

Sure, I have uploaded it here: https://github.com/rafzi/tvm_vanilla_ext

It is the unmodified Vanilla code with the added minimal extension definition (ext.py) I posted above.


Hi @r.stahl,

Hmm, I think the issue is that we’ve avoided using the global Target.current() in the entire flow, so it’s not actually set - which is good in one sense but not helpful here :smile_cat:. Though I’m unsure whether you need it at all: do you know why this isn’t simply target = tvm.target.Target(compiler_attr)?
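For reference, the suggested change, paraphrasing the code around the failing line from the traceback above (not a verbatim patch):

# in _lower_relay_to_tir: derive the target from the compiler attribute
# instead of relying on the (unset) global current target
target = tvm.target.Target(compiler_attr)   # rather than tvm.target.Target.current()
if target.kind.name != compiler_attr:
    ...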

@Mousius Thanks for looking into this! With your proposed change, everything seems to be working fine. I’m not sure about the rationale of that snippet. @cgerum Do you know more and would that change work for you? (introduced here: Use better function name for te_lowering and annotate current target … · MichaelJKlaiber/tvm@0f0b1bf · GitHub )

Updated changes:

Alright, then I’ll add some proper testing and submit a PR :slight_smile:

PR: