Why "c --unpacked-api" not work anymore?

Hi, here is my code:

self.mod = tvm.build(sch.mod, target=tvm.target.Target("cuda", host="c --unpacked-api"))
self.mod.save("c", fmt='c')

I remember that "--unpacked-api" still worked as of the 0.8 release (commit 7b3a22e).

cc @Mousius

I think you might need to adjust your tvm.build() call to use executor="graph --unpacked-api". Are you building from 0.8 or from HEAD? This changed just after the 0.8 release (intentionally, so that we had some time to address any issues with it).

My code looks like this:

sch = tvm.tir.Schedule(GenerateProblem)
mod = tvm.build(sch.mod, target="c -executor=graph -unpacked-api=1")
print(mod.get_source())

With the latest commit, the generated source code is still the same as shown in the question. But if I check out v0.8.0, the 'unpacked-api' option works.

@UlrikHjort

Hi, I experienced the same problem. In its current form, the graph executor does not support "unpacked-api", right? I added the following in src/relay/backend/executor.cc at line 96, same as in the AOT executor:

.add_attr_option("unpacked-api")

Then, the following call worked for me:

module = relay.build(
    ir_mod,
    executor=tvm.relay.backend.Executor(name="graph", options={"unpacked-api": 1}),
    target=tvm.target.Target("c", host="c --link-params --unpacked-api"),
    params=params,
)

It is very tricky to find the correct call here because of the way deprecated arguments are handled in build_module.py in the function

def _reconstruct_from_deprecated_options(deprecated_params_target):

If either link-params or unpacked-api is part of the target string, a new executor is created and you lose the options that you had on your original executor. You can therefore also omit options={"unpacked-api": 1} in the Executor instantiation. Somehow I could not get link-params to work as an Executor option, so both link-params and unpacked-api need to be in the target string. I hope this is not too confusing, but I was confused myself.
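
To make the pitfall concrete, here is a minimal sketch of the effect described above (ir_mod and params are placeholders; the comments describe the observed behavior, not the actual internals):

import tvm
from tvm import relay
from tvm.relay.backend import Executor

# Because the target string contains --unpacked-api and --link-params,
# _reconstruct_from_deprecated_options() builds a fresh Executor from those
# flags and discards the options passed in, so the dict below is redundant.
executor = Executor("graph", {"unpacked-api": 1})
target = tvm.target.Target("c", host="c --link-params --unpacked-api")
module = relay.build(ir_mod, executor=executor, target=target, params=params)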


@SebastianBoblestETAS thanks for clarifying, I think we should probably re-add --unpacked-api to GraphExecutor. cc @Mousius

If either of you have cycles to send a PR I’m happy to review!

I created a PR: https://github.com/apache/tvm/pull/9949 The handling of deprecated variables could probably be improved as well, but I do not feel competent to do that; it could easily break things.

Hi @SebastianBoblestETAS,

Sorry about this, I added the deprecation logic to relay.build but not to tvm.build - I'll put a PR together with the warnings enabled there as well. Adding unpacked-api to the Graph executor options would involve adding support for invoking the operators with the correct arguments when the JSON is parsed, which leads me to think we shouldn't re-add the argument: the two aren't compatible, and it would leave users in a weird state. What is your motivation for not using the Ahead of Time Executor here?
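
For intuition, the graph executor drives every node through the generic packed-function interface, roughly as in this simplified sketch (the real dispatch lives in the C++/CRT runtime; lib and the operator name are hypothetical):

# The graph executor looks each node up as a packed function and calls it
# with boxed arguments.
func = lib.get_function("tvmgen_default_fused_add")  # hypothetical op name
func(input_tensor, output_tensor)  # packed call: arguments passed as TVMValues

# With --unpacked-api, codegen instead emits functions with raw-pointer
# signatures, e.g. int32_t fused_add(void* in, void* out) in C, which this
# generic packed dispatch cannot produce.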

I agree that this is confusing. The motivation for not using the AOT executor is that I have my own AOT solution. An additional benefit of the graph executor is that I can analyze the graph.json outside of TVM. What would your proposal be? An additional drop_graph_to_json option for the AOT executor would probably also work.

Ah, I saw these updates after I reviewed the PR @SebastianBoblestETAS put together. Perhaps it makes sense to agree on a strategy here and then iterate on the PR.

I agree the PR needs additional work in GraphExecutor to be viable. It could be somewhat challenging to build calls to the unpacked API at runtime within the constraints of the CRT.

We could also implement a graph export by either a) extracting it from the AOT-generated TIR or b) generating it during AOTExecutorCodegen. I think this might be a useful tool, but I’m also not convinced we should do this. Below are some thoughts about the overall problem.

Another approach would be for @SebastianBoblestETAS to parse the AOT-generated TIR instead of consulting the JSON. I think this might be a bit more complex for someone consuming the output of TVM, which seems worse. On the other hand, comparing the two approaches above, extracting from TIR is the more stable one in the long run as additional TIR-to-TIR compiler passes (e.g. USMP and anything else that wants to do whole-program optimization) are added after AOT code generation. In that situation, we're essentially requiring ourselves to translate TIR into a JSON format and then document this transformation well enough that users could make sense of it. In that light, I wonder if it is easier to improve the docs of our TIR parsing library and make it easy for folks to consult the TIR rather than some kind of graph.json format.

We could also discuss this on Wednesday at the microTVM meetup, if folks are interested.

cc @jroesch @tqchen


Sure, a discussion at the microTVM meetup would be great!

Great, I've added that to the agenda.

Following some discussion, we think it makes sense to add a facility to export the pre-codegen TIR in Model Library Format. I previously created ModelLibraryFormatPrinter to print TIR using consistent variable names. It might be worth looking into tvm.module.Module.AsTVMScript() as well, since that output should parse better.
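
As a quick sanity check of that idea, one could round-trip a module through TVMScript; the exact entry points have moved between TVM versions, so treat the spellings below as assumptions:

import tvm

# Print an IRModule (containing PrimFuncs) as TVMScript, then parse it
# back to confirm the textual form is machine-readable.
text = mod.script()  # assumes mod is a tvm.IRModule
reparsed = tvm.script.from_source(text)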

@SebastianBoblestETAS could you take a look into this and create a new pre-RFC thread with a proposal?

@miraclezqc apologies, I think this thread got a bit hijacked. I think the conclusion here was that -unpacked-api is not expected to produce functional code with -executor=graph, because GraphExecutor doesn't know how to invoke the unpacked API. Did you have a different use for this? Sebastian also had a use for this that didn't follow the checked-in code path, so I'd like to be sure we addressed your concern as well.

Sure, thanks for the links so that I know where to start :grinning: It will take me a few days to get started on this, I am afraid.


Hi, we had a first look into this. For our purposes it actually seems sufficient to add support for emitting the graph.json in the AOT executor. What is your opinion on that?

Two concerns about that:

  • maintenance of the JSON exporter. It would need to be robust to whatever type of TIR is generated by AOTExecutorCodegen. Since both the generator and the consumer are in the same codebase, it might not be that bad.
  • encoding the TIR program in JSON. For example, with USMP we would move to the idea of offset-into-memory-pool memory planning. However, graph.json expects a set of storage_id which correspond to buffers not necessarily placed in any particular section of memory (see the sketch below). graph.json could limit us there. We could define our own JSON format, or declare that we would then adapt graph.json to accommodate the AOT program. We should make an RFC and solicit feedback from @manupa-arm @mousius about that.
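
For concreteness, here is an abbreviated, illustrative sketch of the relevant part of graph.json, written as a Python dict (the node and values are made up):

graph = {
    "nodes": [
        {"op": "tvm_op", "name": "fused_add", "inputs": [[0, 0, 0]]},
    ],
    "attrs": {
        # storage_id assigns each tensor an abstract buffer slot; it says
        # nothing about where that buffer lives in memory, which is why
        # USMP's offset-into-pool planning does not fit this schema.
        "storage_id": ["list_int", [0, 1, 1]],
    },
}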

Ok, great. I must admit that at first glance a general TIR exporter seems like a very complex task for me. But still, I would be glad to implement it once I have some help on exactly where to start.

Sorry for the long time that has passed. I would like to understand better what you propose so that I can finally start writing a pre-RFC.

Is it correct that you propose to subclass TVMScriptPrinter to create a TIR exporter? That looks manageable. We already have a CodeGenerator that does this to some extent for individual kernels. I have trouble figuring out where in the workflow I would create that instance and use it. Could you maybe give me a hint here? If this class then gets the TIR representation of the entire network after AOTExecutorCodegen, the topology of the network should be reconstructable from the input/output relations of the individual kernels, right?
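
A rough sketch of that reconstruction using the public TIR visitor API (mod is assumed to be the IRModule after AOTExecutorCodegen; how call arguments map to tensors depends on the calling convention in use):

import tvm
from tvm import tir

def collect_kernel_calls(prim_func):
    # Walk a PrimFunc body and record every call together with its
    # arguments; buffers shared between calls give the graph edges.
    calls = []

    def visit(node):
        if isinstance(node, tir.Call):
            calls.append((node.op, list(node.args)))

    tir.stmt_functor.post_order_visit(prim_func.body, visit)
    return calls

for gvar, func in mod.functions.items():
    if isinstance(func, tir.PrimFunc):
        print(gvar.name_hint, collect_kernel_calls(func))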

Hey @SebastianBoblestETAS, sorry I missed your last message. In thinking about this a bit more, there are a couple of places this could be added.

Right now we generate a map ir_module_by_target in tvm.build. We export a similar thing in tvm.relay.build, but it's not present on all ExecutorFactory instances, so we can't use it downstream.

Model Library Format currently exports the TIR representation of this using its own printer. If we had the IRModule from tvm.relay.build, we could also export that TIR representation (e.g. in src/tir-N.txt). Ideally, we would first add the IRModule to the tvm.relay.build output, and then modify the Model Library Format artifact to also contain that output.

Two caveats here:

  • Model Library Format can’t handle heterogeneous execution, so this might not work for you.
  • There might be some limitations with TVMScript (cc @junrushao @shingjan), but based on initial discussion with them we think this should work fine.

Maybe you could try doing the first bit and see if it's useful to you and permits the approach you described above (that's indeed what I was suggesting: analyze the TIR to construct the producer/consumer relationships between tensors). If so, we could explore adding it to Model Library Format.
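
For reference, the Model Library Format export that would gain those src/tir-N.txt entries looks roughly like this (a minimal sketch; ir_mod and params are placeholders):

import tvm
import tvm.micro
from tvm import relay

with tvm.transform.PassContext(opt_level=3):
    # Build for the C backend; ir_mod and params are assumed to exist.
    factory = relay.build(ir_mod, target="c", params=params)

# Export the Model Library Format archive, which already contains a printed
# TIR representation under src/.
tvm.micro.export_model_library_format(factory, "module.tar")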


If there is any bug in TVMScript, please feel free to open an issue with a minimal reproducible example :slight_smile: