Export TIR to JSON

We propose to add an API to TVM that allows users to export the TIR representation of the generated kernels as JSON.

Motivation

JSON is a standard file format and can easily be processed in Python, for example. A JSON interface allows analyzing the generated kernels in a target-agnostic way; for example, it works for both LLVM and C targets.

Our goal is to be able to extract the full TIR representation of the host module and all device modules, regardless of the specific executor, target, or runtime.

In the case of the AOT executor pipeline, this would also contain the tvmgen_default___main__ function, which allows reconstructing the graph topology of the neural network. For the Graph executor, the graph can still be accessed via module.graph_json.

Proposed Implementation

We have already conducted some experiments with a JsonScriptPrinter class, similar to the TVMScriptPrinter class in src/printer/tvmscript_printer.cc, based on the discussion here: Why "c --unpacked-api" not work anymore? - #19 by areusch

We use it in driver_api.cc in the build function like this:

We make it accessible in Python via json = module.lib.get_json(). We do not yet know how best to export the TIR representation of the device modules. We would add a new boolean TIR configuration option, "tir.tir_export_to_json", to make this export functionality optional.
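As a sketch of the consumer side (the schema below is hypothetical; defining the actual export format is exactly what this proposal is about), the exported JSON could be analyzed target-agnostically with Python's standard json module:

```python
import json

# Hypothetical export: a JSON string as it might be produced by the proposed
# module.lib.get_json(). The schema here is an assumption for illustration,
# not the final format.
exported = """
{
  "functions": [
    {"name": "tvmgen_default___main__", "kind": "host"},
    {"name": "tvmgen_default_fused_add", "kind": "device"}
  ]
}
"""

data = json.loads(exported)
# List kernel names without caring whether the target was llvm or c.
names = [f["name"] for f in data["functions"]]
print(names)
```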

We think we can provide a working prototype within a few weeks, but would already appreciate feedback now. @Khoi @MJKlaiber @UlrikHjort @areusch


Thanks, @SebastianBoblestETAS. I agree that JSON is a great format for serialization, but I have a few questions:

  1. What are the pros and cons of the JSON format compared with TVMScript (assuming we have a Python environment)?

  2. How would you design a JSON format that stores all TIR information for all possible nodes? Do you have examples to show it?

Thanks @SebastianBoblestETAS. TVM already has a reflection-based mechanism to serialize IR nodes (including TIR and other functions) in JSON format.

So it would be great to see if that functionality can be used directly for the use case you have in mind.

Hi,

thanks for the reference to this function, I was not aware of it. However, I tried it on a Module(c) and got this:

{ "root": 1, "nodes": [ { "type_key": "" }, { "type_key": "runtime.Module" } ], "b64ndarrays": [], "attrs": {"tvm_version": "0.9.dev0"} }

This looks like only a very high-level overview. What we would like to have is the full representation that the code generators receive to write the code. Am I missing an option for this function that would produce more detailed output?

runtime.Module itself is a runtime object, which does not yet support full serialization (since some modules, e.g. DSO modules, cannot be serialized). There was a discussion about an Artifact class which could go in this direction.

If you take any IR object, say a TIR function or an IRModule (that contains a TIR function), we will be able to use save_json to get the fully serialized format.
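For orientation, save_json produces an indexed node table: every attrs value is a string index into the top-level "nodes" array, and Map nodes store [key, value] index pairs. The miniature document below is hand-written for illustration (it mirrors the structure of the real dumps shown later in this thread, not actual save_json output), and can be read with the standard library alone:

```python
import json

# Hand-written miniature of the save_json node-table format: attrs values
# are string indices into the top-level "nodes" array.
doc = json.loads("""
{
  "root": 1,
  "nodes": [
    {"type_key": ""},
    {"type_key": "IRModule", "attrs": {"functions": "2"}},
    {"type_key": "Map", "data": [3, 4]},
    {"type_key": "GlobalVar", "attrs": {"name_hint": "5"}},
    {"type_key": "tir.PrimFunc", "attrs": {"body": "0"}},
    {"type_key": "runtime.String", "repr_str": "main"}
  ]
}
""")

nodes = doc["nodes"]

def deref(idx):
    # attrs entries are string indices; follow one to its node.
    return nodes[int(idx)]

root = nodes[doc["root"]]
functions = deref(root["attrs"]["functions"])
gvar = nodes[functions["data"][0]]        # Map key: the GlobalVar
name = deref(gvar["attrs"]["name_hint"])
print(root["type_key"], name["repr_str"])  # IRModule main
```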

@SebastianBoblestETAS I think this could be helpful for folks consuming TIR outside of TVM itself. A couple of thoughts:

  1. A related problem is that there isn’t currently a way for tvm.relay.build to return the IRModules which contain the TIR itself. You can see that in the BuildOutput struct. I believe here you tried to save the runtime.Module; these are distinct from the IRModule and don’t contain TIR. As a first cut, it would be great to add those to BuildOutput (possibly behind an option) from where they are here. Note that tvm::build currently does return those IRModules, and you can print TIR from them.
  2. As @tqchen mentioned, you might be able to simply call tvm.ir.save_json on the IRModule once we have that returned from tvm.relay.build. Would this accomplish what you want?
  3. I’m not sure whether the graph-level IRModule generated by AOT is included in lowered_funcs IRModule from my link earlier. You might need to check on that and return it from AOTExecutorCodegen.
  4. Since TIR is also used internally in settings ranging from post-scheduling to pre-codegen, there are a few different possible forms of TIR. For example, once LowerTVMBuiltin pass runs, TIR gets significantly more complex as the details of the calling convention are included. If we’re going to commit a standard TIR export, we’ll need to choose a suitable setting from which that’s done. I’d suggest the simplest setting (e.g. before any TIR transforms are run), but unsure if that provides the level of detail you need.
  5. We could also opt not to standardize and allow folks to configure the point in time at which the IRModule itself is extracted for returning or saved to JSON in the pipeline.

Anyway, there are a few different thoughts here; I’d be interested in your responses and perhaps we can come to a more specified design from there.

  1. You are right, we did indeed try to save the runtime.Module. We also added the TIR export to the runtime.Module because the generated source code is also put there. We will try the different options you proposed as soon as we find the time.
  2. In the meantime we have written an early prototype of our export. I attached a small sample image of what it looks like, compared to what save_json gives for the same IRModule. I do not think that save_json gives us the required level of detail, but to be honest, I have not looked at it closely yet.

  1. We have not yet looked into this, sorry.
  2. As you can probably see, we want to extract TIR just before the code generator is invoked. We would like to get all the details that also go into the final code generation step.
  3. We want to make this useful for as many use cases as possible, so if we can make the point in time of the export configurable, we are definitely in favor of doing so. I am not sure, however, how much this will affect the implementation of the TIR export printer. It might then need configuration options as well, right?

Thanks for the extensive answers and the hints.

I had a look at save_json(). However, I’m not 100% clear on all of the functionality.

Where are the NDArrays stored, for example?

Given this example, I’d expect b64ndarrays to contain the values of the allocate_const node:

from tvm.script import tir as T
import tvm
import json5


@tvm.script.ir_module
class DummyModule:
    @T.prim_func
    def main(a: T.handle, c: T.handle) -> None:
        T.func_attr({"global_symbol": "main", "tir.noalias": True})
        A = T.match_buffer(a, (10,), "int32")
        C = T.match_buffer(c, (10,), "int32")

        K1 = T.allocate_const([42, 1, 1, 1, 1, 1, 1, 1, 1, 1], "int32", [10])
        for x in T.serial(0, 10):
            C[x] = A[x] + K1[x]


irmodule = DummyModule
print(irmodule.script())
x = tvm.ir.save_json(irmodule)
x_json = json5.loads(x)
nodes = x_json["nodes"]
print(len(nodes))
ndarrays = x_json["b64ndarrays"]
print(len(ndarrays))

Results:

100

0

I also studied the C++ code https://github.com/apache/tvm/blob/main/src/node/serialization.cc#L86. It seems that NDArrays are supposed to be dumped into the JSON file here.

@junrushao1994, @tqchen, what am I missing? Is this the expected behavior?

CC: @SebastianBoblestETAS @UlrikHjort @areusch @aca88


I found the problem, but I am unsure how to solve it, as it seems like a fundamental problem in the implementation of Optional<T>. cc @tqchen @junrushao1994 @Hzfengsy @mousius @manupa-arm

I ran your code and I see this output from print("JSON", x):

JSON {
  "root": 1,
  "nodes": [
    {
      "type_key": ""
    },
    { // node 1
      "type_key": "IRModule",
      "attrs": {
        "attrs": "0",
        "functions": "2",
        "global_type_var_map_": "97",
        "global_var_map_": "96",
        "source_map": "98",
        "type_definitions": "95"
      }
    },
    {  // node 2
      "type_key": "Map",
      "data": [3, 5]
    },
    {  // node 3
      "type_key": "GlobalVar",
      "attrs": {
        "_checked_type_": "0",
        "name_hint": "4",
        "span": "0",
        "virtual_device_": "0"
      }
    },
    {  // node 4
      "type_key": "runtime.String",
      "repr_str": "main"
    },
    {  // node 5
      "type_key": "tir.PrimFunc",
      "attrs": {
        "_checked_type_": "91",
        "attrs": "85",
        "body": "12",
        "buffer_map": "83",
        "params": "6",
        "preflattened_buffer_map": "84",
        "ret_type": "81",
        "span": "90"
      }
    },
// ...
    { // node 12
      "type_key": "tir.AllocateConst",
      "attrs": {
        "annotations": "79",
        "body": "24",
        "buffer_var": "13",
        "data": "20",
        "dtype": "int32",
        "extents": "21",
        "irmod_storage_idx": "0",
        "span": "80"
      }
    },
// ...
    { // node 20 <-- BUG is here
      "type_key": "runtime.NDArray"
    },
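To make the symptom concrete, here is a stand-alone check (standard library only) over a trimmed copy of this dump, with the comments removed so the json module can parse it; it flags NDArray nodes that carry nothing but their type_key:

```python
import json

# Trimmed copy of the dump above, comments removed so plain json can parse it.
dump = """
{
  "root": 1,
  "nodes": [
    {"type_key": ""},
    {"type_key": "IRModule", "attrs": {"functions": "2"}},
    {"type_key": "tir.AllocateConst",
     "attrs": {"data": "3", "dtype": "int32"}},
    {"type_key": "runtime.NDArray"}
  ],
  "b64ndarrays": []
}
"""

doc = json.loads(dump)
# A serialized NDArray node should carry a payload (or the payload should
# land in b64ndarrays); a node holding nothing but its type_key is dangling.
dangling = [
    i for i, node in enumerate(doc["nodes"])
    if node["type_key"] == "runtime.NDArray" and len(node) == 1
]
print(dangling)                   # [3]: the constant's node is empty
print(len(doc["b64ndarrays"]))    # 0: its data never made it into the file
```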

The problem is evident in node 20, which should not exist. Node 20 is the result of NodeIndexer visiting AllocateConst via VisitAttrs. The bug may be inside VisitAttrs:

class AllocateConstNode : public StmtNode {
 public:
  /*! \brief The buffer variable. */
  Var buffer_var;
  /*! \brief The optional data associated to the constant.
   */
  Optional<runtime::NDArray> data;
  /*! \brief If the PrimFunc containing the Stmt is added to IRModule,
       this is an optional index to indicate the index within
       "Constants" attribute, that is a Array<NDArray> of IRModule.
   */
  Optional<Integer> irmod_storage_idx;
  /*! \brief The type of the buffer. */
  DataType dtype;
  /*! \brief The extents of the buffer. */
  Array<PrimExpr> extents;
  /*! \brief The body to be executed. */
  Stmt body;
  /*!
   * \brief Additional annotations about the allocation.
   *
   *  These annotations can be used as auxiliary hint
   *  to future transformations.
   */
  Map<String, ObjectRef> annotations;

  void VisitAttrs(AttrVisitor* v) {
    v->Visit("buffer_var", &buffer_var);
    v->Visit("data", &data);  // BUG: casts Optional<T> to Object* via Visit() overload. Reflection expects this to invoke Visit(const char* key, runtime::NDArray* value).
    v->Visit("irmod_storage_idx", &irmod_storage_idx);
    v->Visit("dtype", &dtype);
    v->Visit("extents", &extents);
    v->Visit("body", &body);
    v->Visit("annotations", &annotations);
    v->Visit("span", &span);
  }

Unfortunately, because Optional<T> is templated, T is not available in the implementation of NodeIndexer::Visit(const char*, ObjectRef*). This means there is no way to tell the difference between an ObjectRef that isn’t defined and an ObjectRef which is an Optional<T> (and, in that case, what is T?).
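The type-erasure problem can be mimicked in plain Python (purely as a mock of the dispatch described above, not TVM code): once the NDArray is wrapped in a generic optional reference, the indexer's special case no longer fires and the payload is silently dropped:

```python
# Mock of the dispatch problem described above (illustration only, not TVM code).
class NDArray:
    def __init__(self, values):
        self.values = values

class Optional:
    # Like C++ Optional<T>, the payload type T is not recorded in a way the
    # visitor can see: it only ever receives a generic object reference.
    def __init__(self, value=None):
        self.value = value

class NodeIndexer:
    def __init__(self):
        self.b64ndarrays = []

    def visit(self, key, value):
        if isinstance(value, NDArray):
            # Special handling: the payload goes into b64ndarrays.
            self.b64ndarrays.append(value.values)
        else:
            # Optional-wrapped NDArrays fall through to the generic-object
            # path, so their payload is never recorded.
            pass

indexer = NodeIndexer()
indexer.visit("data", Optional(NDArray([42, 1, 1])))
print(len(indexer.b64ndarrays))  # 0: the constant was lost
```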

I made an attempt to fix this, but the unit test fails because load_json doesn’t know that it needs to look up the value of "data" in b64ndarrays. Not sure how to fix that; we could either modify the serialization format or introduce an explicit OptionalNode to actually hold the NDArray and define T.

TIR already has a textual representation, so why not make sure it’s parseable, and then just write a parser for it?

Thanks @areusch.

This is indeed a bug. The cause is that we specially handle NDArray serialization, but did not handle the case where the C++ type is ObjectRef while the value itself is an NDArray.

It had not been a problem before, since NDArray was not part of Object. But it appears as a bug now that NDArray has been unified as a subclass of Object (and such a possibility arises).

Fixing it will involve updating the serialization mechanism of ObjectRef to specifically handle NDArray (and Optional is not an issue here). I will dig a bit and get to a resolution soon.

Yeah, honestly I think this is the proper solution long-term. The question is whether you should have to import tvm to parse TIR, which is true now (e.g. via TVMScript). Also, TVMScript relies on Python AST parsing, which is another thing that may not be desirable if we make TIR an exported language. At this point I was just trying to unblock the save_json approach in case TVMScript isn’t usable here, and ran into this bug.


Not sure I follow how Optional<T> isn’t responsible for triggering this, but curious to see what you find.

This is fixed in https://github.com/apache/tvm/pull/11303. We will use the new repr_bytes mechanism to save NDArray.
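In stdlib terms, a repr_bytes-style encoding amounts to embedding the raw constant bytes in the node itself as a base64 string (this is an illustrative sketch, not the actual implementation in the PR):

```python
import base64
import json
import struct

# Sketch of a repr_bytes-style encoding (illustration only, not TVM code):
# the raw bytes of the constant are carried inside the node as base64.
values = [42, 1, 1, 1]
raw = struct.pack("<4i", *values)  # four little-endian int32 values

node = {
    "type_key": "runtime.NDArray",
    "repr_bytes": base64.b64encode(raw).decode("ascii"),
}
text = json.dumps(node)

# Round-trip: decode the base64 payload back into the original ints.
payload = base64.b64decode(json.loads(text)["repr_bytes"])
decoded = struct.unpack("<4i", payload)
print(list(decoded))  # [42, 1, 1, 1]
```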


Tested it. Looks good :+1: