[pre-RFC] TVM Explorer Infrastructure

chunit · September 30, 2022, 9:02am

Sorry for the late reply. I now understand the Relay Var proposal much better. Thank you for your patience! We have some initial thoughts about it. But as you said, it deserves its own RFC if we want to introduce this tracing map, so I will put that discussion at the end of this post.

Before that, would it be fine for us to prepare the PRs for this RFC if they look good to you? We can split the PRs into three independent parts:

Frontend span filling
Schedule recorder
Pass span filling*

Currently most of the discussion is about the pass span part. We can continue that discussion and, at the same time, if frontend span filling and the schedule recorder look good to you, we will prepare their PRs and submit them soon. If pass span filling is a good enough midterm solution, we can also submit its PR later. Finally, based on our conclusion, we can create a new RFC about the Relay Var tracing map. Would this plan look good to you?

About the Var tracing map, I think it is a good mechanism, because we can always find where an IR expression comes from. Based on this idea, we tried to identify the obstacles we would need to overcome. To me it is a really challenging topic. I totally agree with making a new pre-RFC for it. Haha

Data structure and what function to be called

The tracing map should have the form <var, Array<var>>.

To support n-to-n (n >= 1) conversions, we need an array to preserve their relations.
IRModule includes the historical maps and functions during transformation

Therefore it might look like:
[Image: Var_Map]
SIBuilder might not be necessary in this scenario

Since we could get the expression mapping by traversing the tracing map, we can assign the span to an expr directly, with no need to find the input/output of a transformed expression.
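To make the <var, Array<var>> idea concrete, here is a minimal sketch in plain Python rather than TVM code. The names `TracingMap`, `record`, and `sources_of` are hypothetical, and vars are represented by strings purely for illustration; a real design would key on Relay objects.

```python
from collections import defaultdict


class TracingMap:
    """Hypothetical <var, Array<var>> map: transformed var -> vars it came from."""

    def __init__(self):
        self._sources = defaultdict(list)

    def record(self, new_vars, old_vars):
        """Record an n-to-n rewrite: every new var is derived from every old var."""
        for new_var in new_vars:
            for old_var in old_vars:
                if old_var not in self._sources[new_var]:
                    self._sources[new_var].append(old_var)

    def sources_of(self, var):
        """Return the vars that `var` was derived from (empty if never recorded)."""
        return list(self._sources.get(var, []))


# A fusion-like rewrite: two exprs are merged into one.
tracing_map = TracingMap()
tracing_map.record(new_vars=["%fused"], old_vars=["%reshape", "%add"])
fused_sources = tracing_map.sources_of("%fused")
```

With such a map, a span could be copied from the entries in `fused_sources` to `%fused` without inspecting the transformation itself.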

Obstacles we might encounter

We might need to construct a new data structure according to the index of var.

I haven’t fully read the Doc Printer, but suppose there is an example that looks like this:
```
@fn () {
    %0 = ...
}
@main () {
    %0 = ...
}
```
Then we need to make our map be able to recognize which %0 we are talking about.
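One possible way to tell the two `%0` apart, sketched under the assumption that vars are identified by their printed names: qualify each key with the enclosing function. The `VarKey` alias and the `%a`/`%b` source vars are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical key: (global function name, local var name).
VarKey = Tuple[str, str]

tracing_map: Dict[VarKey, List[VarKey]] = {
    ("@fn", "%0"): [("@fn", "%a")],
    ("@main", "%0"): [("@main", "%b")],
}

# The two %0 entries no longer collide.
fn_sources = tracing_map[("@fn", "%0")]
main_sources = tracing_map[("@main", "%0")]
```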
Annotating original expr to the transformed expr is time consuming

Basically it seems to me that this is the most doable way, but it is almost the same as what we are doing for span filling. It would not be automatic enough, but at least it might be easier to achieve.
Modifying mutate_ of Mutator or Rewriter would require a large number of changes.

Almost all passes inherit from Mutator or Rewriter, so we would need to check them carefully.
Difficulty of making an analysis pass

So far I have not figured out a workable method. It becomes hard to do analysis for passes with multiple sources/results.
Should be aware of the performance impact:

Once we have a sequence of maps and the original Relay functions, we need to traverse the maps for each expr in the end. The time complexity would be O(N*M), where N is the number of exprs and M is the number of maps.
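The traversal cost can be illustrated with a small sketch. It assumes one map per pass of the form {new_var: [old_var, ...]}, with vars as strings; `trace_to_origin` is a hypothetical helper, not an existing TVM function.

```python
def trace_to_origin(var, maps):
    """Walk `var` back through the per-pass maps, newest pass first.

    `maps` is ordered from the first pass to the last. Vars missing from a
    map are treated as unchanged by that pass.
    """
    current = [var]
    for pass_map in reversed(maps):
        previous = []
        for v in current:
            for src in pass_map.get(v, [v]):
                if src not in previous:
                    previous.append(src)
        current = previous
    return current


# Two passes: add -> subtract, then (reshape, subtract) -> my_expr.
maps = [
    {"%subtract": ["%add"]},
    {"%my_expr": ["%reshape", "%subtract"]},
]
origins = {v: trace_to_origin(v, maps) for v in ["%my_expr"]}
```

Each expr has to look at all M maps, so resolving N exprs costs on the order of N*M lookups, before counting any fan-out inside a single map.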

That’s all we can come up with currently. For the long-term solution, we think a tracing map would be a necessary mechanism. Yet it should be planned carefully in case we run into too much trouble. Currently pass span filling can provide a rough mapping after transformation. Perhaps we can still consider using this feature for now, and try to complete the tracing map for a better result.

Thank you again for reading this. We will stay tuned!

areusch · October 3, 2022, 3:51pm

Here’s some replies to the first part of your post. I’ll get back to the rest of it in a few days here.

Go for it!

Yeah that sounds great to me. Apologies for derailing this around the Var tracing proposal.

chunit · October 5, 2022, 12:32am

No problem

We will start with frontend span filling. Based on the comments, spans for parameters will be added. Once finished, we will submit the PR for each frontend one by one. Thank you!

zhaoyang-star · October 20, 2022, 4:05am

Great job! I am interested in the feature. When will the feature be available to us?

chunit · October 21, 2022, 3:05am

Hi zhaoyang-star,

Thank you for your compliment.

We have asked our legal support, and things are complicated as we mentioned above. So far we cannot provide a precise date for it. Yet at least I would say it will happen after the features in this RFC are ready and stable in the TVM main branch.

lixiaoquan · November 25, 2022, 1:23am

Hi chunit,

I have to say the PART part is very useful. In daily debugging, we frequently need to trace Relay IR back to the original model, so keeping the output name the same as the original name in the model makes that much easier. Without that, we always have to change the original name to original_OUTPUT.

chunit · November 29, 2022, 2:33am

Hi @lixiaoquan .

Thank you very much for giving us advice again! I agree with you that it is quite important to indicate which Relay expression generates the output tensor of its original frontend layer. But we encountered some problems and removed the suffix string entirely. Therefore we would like to ask for advice in this RFC. So far we have two possible ways to handle this issue.

As you said, add a suffix to indicate the expression that generates the output tensor.
Leverage the Var approach mentioned by @areusch to tag the output result.

The most straightforward way should be the first one, adding the suffix, and perhaps it is an acceptable compromise. In the following paragraphs I would like to detail the problem and solution 1, because solution 2 still needs time to design. Note that we have pushed the very first PR of this RFC. It would be great if you have time to take a look at it.

Problem

In the previous version we set _PART_{idx} as the suffix. However, once we proceed to pass transformation, this suffix becomes annoying and really hard to deal with. Even worse, after invoking several passes, these suffixes seem to be meaningless.

Solution1

Add a suffix _OUTPUT to indicate that we are generating the output expression of a frontend layer. To be more precise, here is the modification we can make in our PR:

Now we invoke set_span in several places for the parameter sources. Those frontend sources will be converted to Constant or Var, so there is no need to indicate an output. Therefore we would add one more parameter to the common API:
```
def set_span(expr, span, is_output)
```

Control the is_output flag to make the final expression be tagged like SOURCE_NAME_OUTPUT. Take the TFLite OP as an example again. It would look like:

def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
    %0 = shape_of(%input, dtype="int32") /* Shape_OUTPUT */;
    %1 = strided_slice(%0, …) /* strided_slice */;
    %2 = squeeze(%1) /* strided_slice_OUTPUT */;
    %3 = expand_dims(%2, axis=0) /* stack */;
    %4 = expand_dims(3, axis=0) /* stack */;
    %5 = expand_dims(3, axis=0) /* stack */;
    %6 = (%3, %4, %5) /* stack */;
    %7 = concatenate(%6) /* stack_OUTPUT */;
}

We could consider setting the span of the output expr to something like “stack_OUTPUT” or “stack FINAL_OUTPUT”, or add one more parameter to let users customize it. I am not sure whether it is good to fix the string in the spec or not.

Personally I would like to set is_output to False by default. The reason is that, most of the time, we will go on to run the build command or apply pass transformations. At that stage the OUTPUT suffix could become meaningless, as mentioned above. We can write more documentation for both the set_span API and each frontend conversion to tell users where and when to turn this parameter on. That way users have the flexibility to obtain the more precise result (the output location), and will not be confused when they reach pass transformation or even the lowering process.
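The intended effect of the flag can be sketched without TVM. `tag_source_name` below is a hypothetical stand-in for the naming logic only; it is not the real `set_span`, and the `suffix` parameter is an assumption based on the customization idea above.

```python
def tag_source_name(source_name, is_output=False, suffix="_OUTPUT"):
    """Return the source name to store in a span.

    Hypothetical stand-in for the naming step of set_span(expr, span, is_output).
    """
    return source_name + suffix if is_output else source_name


# Exprs converted from the TFLite "stack" layer; only the last one is the output.
stack_exprs = ["expand_dims", "expand_dims", "expand_dims", "tuple", "concatenate"]
spans = [
    tag_source_name("stack", is_output=(i == len(stack_exprs) - 1))
    for i in range(len(stack_exprs))
]
```

With `is_output` left at its default of False, every expr of the layer keeps the plain source name, which matches the current behavior.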

Thank you for reading such a long description. I hope it provides you with more context. It is great to have a conversation about this issue.

lixiaoquan · December 6, 2022, 2:51am

Thank you very much for the explanation. I agree suffixes may be hard to handle, but I just wonder why these suffixes become meaningless after passes. At that stage, we still need them to trace back to the original model. I agree that after some transformations, some ops may be eliminated and others may be grouped into different fused functions. But we can still have a clear view of where those ops come from.

I think precisely because the IR may change a lot due to DynamicToStatic/Const Folding/…, those spans become even more important after transformation. It sometimes has another benefit: when I see an op without a span, I can guess it is generated by a pass, not from the original model.

chunit · December 7, 2022, 1:32am

Hi @lixiaoquan

Thank you very much for the explanation.

You’re welcome, it’s really great to have a discussion about this issue.

It sometimes has another benefit: when I see an op without a span, I can guess it is generated by a pass, not from the original model.

Just one more thing before a long and detailed description of the removal of the suffix. About the span after processing a pass: in the third part of this RFC we aim to fill spans for those Relay passes which are involved in the build flow. Therefore, if everything in this RFC goes well, you can expect a span-tagged result after invoking certain passes.

In the following I will try to describe the removal of suffixes from several aspects. I hope it gives you more of the reasons why we wanted to remove them all in the end.

Reference to a related work

Take Compiler Explorer for example. You can see that most functions in the high-level language (C++) are converted to a sequence of lower-level instructions (assembly). It’s not necessary to tag a suffix onto the generated lower-level instructions. In this scenario, knowing which set of lower-level instructions a high-level construct is converted to is enough. Similarly, the suffixes might not be necessary in Relay, because knowing which set of Relay exprs a frontend model layer is converted to might be enough at first glance.

How many benefits can we get from suffix?

Personally, I think this is the most important reason. Suppose what we focus on is “frontend → Relay”. What would happen if we kept the suffixes after invoking build() or optimize()? There might be a span-tagged expr like this:

%0 = expr(...) /* span=other_expr_PART_3 */

If users do not check the very first Relay module, which is generated by the API from_{frontend}(), the _PART_ means nothing to them. (The only thing they know is: OK, the generation of this expr is related to “other_expr”, but I don’t know the meaning of PART.) Even worse, seeing the suffix might confuse them: where can I find the PART? Therefore, if we focus only on the “frontend → Relay” conversion, it is not a good choice to use a suffix when passes are involved.

On the other hand, the value of the suffix might be as a special indicator of an expr between passes, which means “Relay → Relay”. However, the more we try to enhance the debugging ability, the more we find that “sourceName” might not be the best choice for it. For example, suppose I have an IR whose original form looks like this:

# IR1
fn() {
  %0 = reshape(..) /* span=reshape_add_PART_1 */
  %1 = add() /* span=reshape_add */
}

Then I invoke a pass which converts the add call to a subtract call:

# IR2
fn() {
  %0 = reshape(..) /* span=reshape_add_PART_1 */
  %1 = subtract() /* span=reshape_add */
}

Finally I invoke a pass to merge these two exprs into an op which I defined myself:

# IR3
fn() {
  %0 = my_expr() /* span=[reshape_add_PART_1, reshape_add] */
}

You can see that the my_expr call comes from reshape_add_PART_1 and reshape_add. However, that describes the relation between IR3 and IR1. When I want to compare IR3 with IR2, I have to rely on the legacy (the sourceName of the span) from IR1, which might be confusing if read literally. The better choice in this scenario might be to use the line and column information, so that the relationship between IR3 and IR2 becomes more straightforward. Yet there are still things to be dealt with if we want to use the line and column information, such as how to define the line number of an IR module.

In summary, in the “frontend → Relay” conversion, the suffix seems unnecessary. On the other hand, in the “Relay → Relay” conversion, using the suffix is not straightforward enough.

Any workaround?

If the suffix is really hard to give up, perhaps a post-processing pass can be a workaround. For example, a pass traverses the whole Relay IR and tags _PART_ onto those spans with the same SourceName. With the help of this pass, we can get a result very similar to the previous reverted PR.
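A rough sketch of what such a post-processing step could compute, operating on a flat list of source names instead of real Relay IR. The function `tag_parts` is hypothetical, and two details are assumptions: the index is zero-based, and the last expr of each group keeps the plain name (as in the IR1 example above, where the final add has no suffix).

```python
from collections import Counter


def tag_parts(source_names):
    """Append _PART_{idx} to all but the last expr of each group sharing a source name."""
    remaining = Counter(source_names)  # exprs of each group not yet visited
    seen = Counter()                   # exprs of each group already visited
    tagged = []
    for name in source_names:
        remaining[name] -= 1
        if remaining[name] > 0:
            tagged.append(f"{name}_PART_{seen[name]}")
        else:
            tagged.append(name)
        seen[name] += 1
    return tagged


tagged = tag_parts(
    ["Shape", "strided_slice", "strided_slice", "stack", "stack", "stack"]
)
```

A real pass would visit exprs in the IR and rebuild their spans; only the naming rule is shown here.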

Thank you for reading such a long reply. It feels great to have a discussion with you about this topic.

lixiaoquan · December 8, 2022, 1:42am

Hi chunit,

It seems your explorer can find the boundary of a group of IRs from a layer, which is wonderful. Does it use the same Span to determine a group? If so, I guess there is a quick search to find all the IRs with the same Span.

If we need to determine the group by eye, we have to do a quick search too.

Is it possible to expand span a little bit to indicate it is an “Output”? Then the original source doesn’t have to be changed with any tag and it will not mess up other parts of your design. And that “output” field can be printed in text for quick comparison between Relay IR and the original model (just for humans).

def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
    %0 = shape_of(%input, dtype="int32") /*Shape*/ /* Shape_OUTPUT */;
    %1 = strided_slice(%0, …) /* strided_slice */;
    %2 = squeeze(%1) /* strided_slice*/ /* strided_slice_OUTPUT*/;
    %3 = expand_dims(%2, axis=0) /* stack */;
    %4 = expand_dims(3, axis=0) /* stack */;
    %5 = expand_dims(3, axis=0) /* stack */;
    %6 = (%3, %4, %5) /* stack */;
    %7 = concatenate(%6) /* stack */ /* stack_OUTPUT */ ;
}

def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
    %0 = shape_of(%input, dtype="int32") /*Shape*/ /* Shape_OUTPUT */;
    %1 = strided_slice(%0, …) /* strided_slice */;
    %2 = squeeze(%1) /* strided_slice*/ /* strided_slice_OUTPUT*/;
    %3 = expand_dims(%2, axis=0) /* stack */;
    %4 = expand_dims(3, axis=0) /* stack */;
    %5 = expand_dims(3, axis=0) /* stack */;
    %10 = fn(%001, %002, %003) {
       %6 = (%001, %002, %003) /* stack */;
       %7 = concatenate(%6) /* stack */ /* stack_OUTPUT */ ;
    }
    %11 = %10(%3, %4, %5)
}

In a case like that, when IRs from a layer are put in different functions, we can still find the output of stack quickly.

chunit · December 8, 2022, 7:53am

Hi @lixiaoquan

Thanks for the fast reply!

Does it use the same Span to determine a group?

About the Explorer, yes, we group a set of exprs based on the span. Exprs with the same span have the same color. When the mouse hovers over one of them, the whole set of exprs becomes a bit darker. Besides, our UI provides search functionality too. For your reference, the following picture could demonstrate the behavior:

Is it possible to expand span a little bit to indicate it is an “Output”?

Thank you for pointing out this way to deal with the problem.

It should be doable. We can achieve this by modifying the attributes of Span and the Relay text printer. The attributes of Span would become like this:

class SpanNode : public Object {
 public:
  SourceName source_name;
  int line;
  int column;
  int end_line;
  int end_column;
  bool frontend_output;
  //...
};

Yet I’m not quite sure whether it would be a better choice than the way I mentioned here, which aims to let advanced users configure it on their own. The main reason, from my point of view, is that Span seems to be designed for “Expr”, the parent class of “RelayExpr”. Take a look at the relationships of the class BaseExprNode. We can see that RelayExprNode and PrimExprNode both derive from this base class. It might be too specific to add one more attribute indicating whether the tagged expr corresponds to the output of its frontend layer.

class BaseExprNode : public Object {
 public:
  /*!
   * \brief Span that points to the original source code.
   *        Reserved debug information.
   */
  mutable Span span;
  static constexpr const char* _type_key = "BaseExpr";
  //...
};
class RelayExprNode : public BaseExprNode {...}
class PrimExprNode : public BaseExprNode {...}

Thanks for reading again. If I miss something, please don’t hesitate to let me know.

lixiaoquan · December 9, 2022, 1:23am

Hi chunit,

I understand the point that Span is not only for Relay. I think Span was designed for the Relay text format in the very beginning, because it has source_name, line, and column, and there is an AnnotateSpan() pass to get line and column. It has already been a compromise to fill layer name to source name.

chunit · December 13, 2022, 9:16am

Hi @lixiaoquan

It has already been a compromise to fill layer name to source name.

I see. I had read the comments in your PR before. Personally, I like the choice of using the layer as the source name, because we cannot always get the source file name of a neural network model. In fact, I prefer the layer name to the source file name.

I will try to summarize what we have discussed below. I believe we can find a way to keep the generality and also get the output information.

Back to the topic: currently we have two choices for marking the converted expr which corresponds to the frontend output.

1. Extend the object Span: Add an attribute

As you mentioned in the reply, we can add an attribute to mark that the span is the output of a frontend layer. The implementation would be like:

class SpanNode : public Object {
 public:
  SourceName source_name;
  int line;
  int column;
  int end_line;
  int end_column;
  bool frontend_output;
  //...
};

The pro of this method is that it is very straightforward. We can obtain the output information by modifying only the attributes of Span and the Relay text printer.

The con is that this attribute is too specific to the frontend → Relay conversion.
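To illustrate option 1 end to end, here is a plain-Python stand-in for the extended Span together with a printer in the style suggested earlier in the thread. Everything here is a sketch: the dataclass only mirrors the C++ fields listed above, the flag is spelled `frontend_output`, and `print_span` is a hypothetical helper rather than the real Relay text printer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Plain-Python stand-in for SpanNode with the proposed extra flag."""

    source_name: str
    line: int = 0
    column: int = 0
    end_line: int = 0
    end_column: int = 0
    frontend_output: bool = False  # attribute proposed in this thread


def print_span(span):
    """Render the span comment the way an extended text printer might."""
    text = f"/* {span.source_name} */"
    if span.frontend_output:
        text += f" /* {span.source_name}_OUTPUT */"
    return text


inner = print_span(Span("stack"))
output = print_span(Span("stack", frontend_output=True))
```

Note that `source_name` itself is never modified, so grouping exprs by source name keeps working; only the printed text gains the extra marker.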

2. Extend the Mutator `SpanFiller`: Add a user-defined callback

I had a discussion with my team members about how to make it more generic. The following is an outline of what we came up with:

SpanFiller is the core functionality of set_span. We might define an interface for SpanFiller, which SpanFiller then invokes to rewrite the source name. Users should override the interface to achieve the behavior they want.

To be more precise, the SpanFiller might become like the following:

class _SpanFiller(ExprMutator):
    """SpanFiller"""

    def __init__(self, span, annotator...):
        #...
        self._annotator = annotator

There are several ways to pass the annotator to SpanFiller, because set_span is invoked in the internal part of the from_{frontend} API. We might obtain this annotator by accessing the PassContext, by changing the from_{frontend} interface, and so on. Here we take the PassContext way as an example:

my_annotator = FrontendSpanAnnotator(...)
with tvm.transform.PassContext(config={"relay.frontend.span_annotator": my_annotator}):
    mod, param = from_{frontend}(...)

About the annotator, it might look something like this:

# Interface class
class FrontendSpanAnnotatorBase:
    # source_str is obtained from SpanFiller
    def annotate_source_name(self, source_str):
        raise NotImplementedError()

    def generate_span(self):
        return tvm.relay.Span(
            self.annotate_source_name(source_str), 0, 0, 0, 0
        )

Users should implement the interface. Then SpanFiller will use this annotator to change the source name of the current span. Take the visitor of var for example. It would be like:

def visit_var(self, var):
    if self._annotator:
        return _expr.VarWithFields(..., self._annotator.generate_span())
    return _expr.VarWithFields(var, var.vid, var.type_annotation, None, self._span)

It’s easy to tag the suffix “_OUTPUT” onto the source name with the annotator. The pros of this method are that, first, it gives users high flexibility to define their own annotating method and, second, it keeps the generality of the Span object. The con, of course, is that it requires users to have some fundamental knowledge to write the annotator.

These two methods are what we have so far. Thank you for discussing this issue with us. Now we have more ideas about how to tackle this problem.

One more thing: our first PR is almost done, and its changes do not alter the behavior of the current _set_span. Therefore, we would like to keep that PR unchanged. If the proposal here looks good to you, we can submit a separate PR to add the functionality we talked about here. Would that be fine with you?
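As a self-contained illustration of the callback idea, the sketch below uses plain Python with no TVM dependency. `SuffixAnnotator` and `MockSpanFiller` are hypothetical names; the mock only shows where the callback would be invoked, and how the filler decides which expr is the layer's output is left open, as it is in the proposal.

```python
class FrontendSpanAnnotatorBase:
    """Interface: subclasses decide how a source name is rewritten."""

    def annotate_source_name(self, source_str):
        raise NotImplementedError()


class SuffixAnnotator(FrontendSpanAnnotatorBase):
    """Hypothetical user-defined annotator that appends a fixed suffix."""

    def __init__(self, suffix):
        self._suffix = suffix

    def annotate_source_name(self, source_str):
        return source_str + self._suffix


class MockSpanFiller:
    """Stand-in for SpanFiller: falls back to the plain name without an annotator."""

    def __init__(self, source_str, annotator=None):
        self._source_str = source_str
        self._annotator = annotator

    def source_name(self):
        if self._annotator:
            return self._annotator.annotate_source_name(self._source_str)
        return self._source_str


plain = MockSpanFiller("stack").source_name()
tagged = MockSpanFiller("stack", SuffixAnnotator("_OUTPUT")).source_name()
```

Because the default path is untouched when no annotator is supplied, existing callers would see no change in behavior.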

Thank you for reading. Any comment is appreciated.

lixiaoquan · December 14, 2022, 9:14am

Hi chunit,

I appreciate that you can consider my comment.

Personally I prefer 1), because that is a minor change and we can still make use of TVM Explorer. If source_name is changed, the changed span may not be able to work with TVM Explorer.

chunit · January 3, 2023, 7:19am

Hi @lixiaoquan

Sorry for the late reply. Thanks a lot for continuing the discussion with us!

Personally I prefer 1), because that is a minor change

I see. I agree, method 1) is pretty straightforward. We are fine with both methods. Yet personally I feel method 2) could be a better foundation for any customized requirement. @areusch would you mind giving us some more suggestions from a maintainer’s point of view? Thanks

we can still make use of TVM Explorer. If source_name is changed, the changed span may not be able to work with TVM Explorer.

About the Explorer, both methods could fit into Explorer, and both methods require Explorer to add some more text parsing functions. The effort is the same.

gfvvz · September 11, 2024, 7:42am

@chunit is this tool open-sourced?

the code base of TVM Explorer is maintained in another git repository and not included in this RFC
