[DISCUSS] Module based Model Runtime Interface

When I read our post and try to summary, I find that we have this not to be decided. As previous post said, we don’t like GraphRuntimeFactoryModule. However, we need to decide the default value of packing parameters.

Hmm, would love to get people’s thoughts about param packaging. I think it is OK it leave it unpackaged(since it is the previous behavior) then we decide later if we want to turn it on by default

Let us keep it one day to hear thoughts of whether we should packing the weights by default. @haichen @jwfromm

BTW, if we don’t package weights, when we deploy , we will need two things : deploy.so / deploy.params. However, our API will encourage users : factorymod = relay.build(...) and when we call __iter__ we will warn, seems it breaks our constraint here.

You are right, let us package by default then. In most cases seems package params is the right approach

I agree package params by default is good since it’s likely that TVM transforms the params during the optimization

As a side note, for large models like VGG19, it might require a long time to compile the params into shared library. :frowning:

This problem should be fixed by my this pr: https://github.com/apache/incubator-tvm/pull/4657

1 Like

Sorry for the delay because of other things. I will file a formal RFC on GitHub this weekend (or early next week) based on our discussion.

1 Like

GitHub RFC: https://github.com/apache/incubator-tvm/issues/5038

Wish I don’t miss some part of so significant discussions we have completed. :slight_smile:

const unsigned char __tvm_dev_mblob[46788038] = {“TVM_BLOB_SIG”}; maybe not enough. because 46788038 is too big for many embedded system, so I have to place __tvm_dev_mblob to special section, for example, a rodata section. so I mean I need declare __tvm_dev_mblob as const unsigned char __tvm_dev_mblob[46788038] attribute((section(".rodata"))); the declaration grammar is compiler specific, put to which section is compiler specific too. so I think the RFC need to consider the case.

Thanks for respond.Finally, we don’t use this special hack. We will generate this directly using LLVM IR. And LLVM will put this into rodata section correctly.

Like this test:

Good solution! Thanks FrozenGene! but if we use LLVM, llvm series target can take advantage of this solution, I’m not sure if other targets such as cuda can use this solution.

CUDA also could use this. Because cuda’s target host is LLVM. As the example I show, it is in fact cuda target. So you could see NVIDIA NNVM Compiler in the constant string.

I got it. Thanks FrozenGene.

This won’t work by default for the C backend where we don’t necessarily rely on the presence of llvm or are we saying that there needs to be an llvm solution for the backend just to produce this constant data object always, so we do need a general solution …

Ramana

When we don’t have LLVM, we will fallback to our original way (call compiler to generate)

So, the problem hasn’t been fixed : there is a “solution” depending on the presence of an llvm target.

Ramana

I think I should clarify your question. Do you mean we should generate .rodata section of unsighed char __tvm_data_blob[]?

I wasn’t proposing that as a solution, that is one of the options. I’m merely stating that this is still a problem that will hit others most notably anyone using the C backend .

Ramana