[pre-RFC] TVM Explorer Infrastructure

Wow, such excellent work! I have always wanted an interactive debugging feature like this when playing around with TVM, and you made it come true! Looking forward to the release :clap:

2 Likes

Cool, thanks for the explanations!

The Var thing I’m discussing here is not exactly a simple tweak to this proposal–it’s probably significant enough lift that it would deserve its own RFC. So just to clarify–I’m not necessarily asking you to change your approach. However, I did want to raise this question to a) build more support for the idea, b) see if it is potentially easier to pursue than adding SIBuilder support to the remaining passes, and c) think through whether it’d be easier to maintain in the long run.

The basic idea is like so: consider your one-to-many example conversion. A common challenge we face in TVM is determining which Relay Expr correspond to one another before and after a pass. To choose a concrete example, suppose we introduce a pass which outlines part of a function (suppose it outlines Pack from your previous example). Before executing the pass, suppose we start from your example:

Now suppose we run the outliner, and arrive at:

def @outlined_pack(%i1) {
  %0 = expand_dims(%i1, axis=0) /* stack */;
  %1 = expand_dims(3, axis=0) /* stack */;
  %2 = expand_dims(3, axis=0) /* stack */;
  %3 = (%0, %1, %2) /* stack */;
  %4 = concatenate(%3) /* stack */;
  %4
}

def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
    %0 = shape_of(%input, dtype="int32") /* Shape */;
    %1 = strided_slice(%0, …) /* strided_slice */;
    %2 = squeeze(%1) /* strided_slice */;
    # the Pack op conversion starts from here
    %3 = @outlined_pack(%2);
    %3
}

Now the question here is: after running the pass, does a new Relay var exist which contains %7? The answer is yes: it's %3. In order to make this outline, an e.g. ExprMutator needed to capture the subgraph that contains %3 through %7, then replace it with a call to the new function and store the result in %3. This pass knows that %3 == %7, and (similarly to how Span information is filled here) when defining %3, could include some type of backreference to %7. This could even just be included as a Map:

using VarMap = Map<Var,Var>;  // keys are originally-imported Var, values are the equivalent now inside f.
Function f = mod.GetFunction("main");
f->GetAttr<VarMap>("var_map");

This approach could be taken all the way back to the original import (e.g. or there could be an additional map from input framework layer to Relay var).

SIBuilder takes as input a set of Expr which bound the subgraph. Since most Relay programs are transformed in A-Normal form, the VarMap could substitute for these Expr. This won’t work for all optimizations, but I think for a decently large class of them, we could automatically apply SIBuilder by walking VarMap and applying Spans to the subgraphs with endpoints in VarMap. The advantage of this technique is that it could also be done with TIR with the same approach.

I think you'd need to assert that the Relay or TIR graph could be partitioned along VarMap for this to work--so I'm not saying it would work for all transforms. But I do think it would work for many. It's also worth noting that this is a best-effort tracking scheme--it's possible through e.g. operator fusion that some Vars could simply be eliminated. In these cases, the VarMap may not contain all Vars from the original model.

Thanks for providing this data! It seems reasonable as part of running with a debug option at least!

1 Like

Thank you for this detailed explanation! We digested the content and tried to apply this concept to an existing pass. There are still many implementation details we have not figured out, yet the following illustrates what we think the Var mechanism should look like. Please kindly correct us if we misunderstand anything. :smiley:

Goal

Implement a pass to construct a graph. The graph is a tracing map to record the transformation before and after a pass.

What the map should look like

Personally I would prefer the keys to be the new equivalents inside f, and the values to be the original Vars. That should make it more convenient for us to trace back to the source. So it would look like:

Map<Var,Var>
// Keys are the equivalent now inside f
// Values are originally-imported Var.

Because after a sequence of pass transformations, we would have a final IRModule. Selecting a certain expression in the final IRModule["main"], we can trace back to the source. If we used the originally-imported Var as the key, we would perhaps have to iterate through the whole map to find the resulting Var after transformations.
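As a toy sketch of this preference (Vars modeled as plain strings, not real Relay objects), keying the map by the transformed Var makes the backward trace a direct lookup, while the opposite key direction forces a scan over the whole map:

```python
# Hypothetical tracing map: transformed Var -> originally-imported Var.
trace_map = {
    "%new_0": "%orig_0",
    "%new_3": "%orig_7",
}

def trace_back(var):
    """Direct O(1) lookup from a Var in the final IRModule to its source."""
    return trace_map.get(var)

# With the originally-imported Var as the key instead, finding the source
# of "%new_3" requires iterating through every entry:
inverted = {"%orig_0": "%new_0", "%orig_7": "%new_3"}
sources = [orig for orig, new in inverted.items() if new == "%new_3"]
```

Here `trace_back("%new_3")` returns `"%orig_7"` in one step, whereas the scan over `inverted` touches every entry.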

How to invoke

Considering the function GetPassPrefix in "src/relay/backend/utils.cc", we insert a pass OutLiner between passes:

//...
pass_seqs.push_back(transform::SimplifyInference());
pass_seqs.push_back(OutLiner);
pass_seqs.push_back(transform::EliminateCommonSubexpr(fskip));
pass_seqs.push_back(OutLiner);
pass_seqs.push_back(transform::SimplifyExpr());
pass_seqs.push_back(OutLiner);
//...

Process looks like

Take the Relay pass SimplifyInference for example: it unpacks certain calls, like the batch_norm op. The following image shows part of the result after the SimplifyInference transformation in our Explorer.

It takes the batch_norm call and its TupleGetItem as source exprs and unpacks them into a set of basic operations.

Now the following is the process once we introduce the OutLiner pass:

Back to the IR pretty print, we would start from IR[“main”] here:

def main(...) {
  %0 = nn.conv2d(%input, %model.conv1.weight,...) /* si=torch._convolution_3 */;
  %1 = nn.batch_norm(%0,...) /* si=torch.batch_norm_8 */;
  %2 = %1.0 /* si=torch.batch_norm_8 */;
}

After the SimplifyInference the IR[“main”] becomes:

def main(...) {
  %0 = add(%model.bn1.running_var, 1e-05f) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %1 = sqrt(%0) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %2 = divide(1f , %1) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %3 = multiply(%2, %model.bn1.weight) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %4 = nn.conv2d(%input, %model.conv1.weight,...) /* si=torch._convolution_3 */;
  %5 = expand_dims(%3, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %6 = negative(%model.bn1.running_mean) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %7 = multiply(%6, %3) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %8 = add(%7, %model.bn1.bias) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %9 = multiply(%4, %5) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %10 = expand_dims(%8, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %11 = add(%9, %10) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
}

Now it is the time to invoke OutLiner. It generates another global function, outlined_bn_0.

def main(...) {
  %0 = nn.conv2d(%input, %model.conv1.weight,...) /* si=torch._convolution_3 */;
  %1 = @outlined_bn_0(%0,...)
}

def outlined_bn_0(%i1...) {
  %0 = add(%model.bn1.running_var, 1e-05f) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %1 = sqrt(%0) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %2 = divide(1f , %1) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %3 = multiply(%2, %model.bn1.weight) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %4 = expand_dims(%3, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %5 = negative(%model.bn1.running_mean) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %6 = multiply(%5, %3) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %7 = add(%6, %model.bn1.bias) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %8 = multiply(%i1, %4) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %9 = expand_dims(%7, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %10 = add(%8, %9) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
}

#Perhaps we would need the original main as reference
def main_before_SimplifyInference_0(){
  #...
}

At the same time, we maintain the tracing map like this (the keys and values should be Vars, yet I am not quite sure how to express them in Var form).

# key: transformed result
# values: original things
map = {
    hash(outlined_bn_0): {%1-batch_norm, %2-%1.0}
}

Using the graph constructed by the tracing map, we should be able to trace an IR back to its very original form. Perhaps the functionality of OutLiner could be implemented based on StructuralEqual, but we haven't come up with a good idea for this yet. Still, if this OutLiner is implementable, it will be really convenient. :smiley:

Questions

Here are some questions we came up with about this strategy:

  1. What IRModule would be used once the OutLiner is invoked? It should be IR1 and not IR2, right?
  • IR1
def main(...) {
  %0 = nn.conv2d(%input, %model.conv1.weight,...) /* si=torch._convolution_3 */;
  %1 = @outlined_bn_0(%0,...)
}

def outlined_bn_0(%i1...) {
  %0 = add(%model.bn1.running_var, 1e-05f) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %1 = sqrt(%0) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %2 = divide(1f , %1) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %3 = multiply(%2, %model.bn1.weight) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %4 = expand_dims(%3, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %5 = negative(%model.bn1.running_mean) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %6 = multiply(%5, %3) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %7 = add(%6, %model.bn1.bias) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %8 = multiply(%i1, %4) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %9 = expand_dims(%7, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %10 = add(%8, %9) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
}
  • IR2
def main(...) {
  %0 = add(%model.bn1.running_var, 1e-05f) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %1 = sqrt(%0) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %2 = divide(1f , %1) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %3 = multiply(%2, %model.bn1.weight) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %4 = nn.conv2d(%input, %model.conv1.weight,...) /* si=torch._convolution_3 */;
  %5 = expand_dims(%3, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %6 = negative(%model.bn1.running_mean) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %7 = multiply(%6, %3) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %8 = add(%7, %model.bn1.bias) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %9 = multiply(%4, %5) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %10 = expand_dims(%8, axis=1, num_newaxis=2) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
  %11 = add(%9, %10) /* si=[ torch.batch_norm_8, torch.batch_norm_8 ] */;
}
  2. If we choose IR1 and continue with the transformations of the remaining passes, it might end up in a nested form. The readability would become terrible. Perhaps an unpack pass for outlined_fn is required too, right?

  3. Still about the nested form: if we use a nested form like IR1, many pattern-matching utilities may need to be rewritten, because they now need to look into outlined_fn in the graph. The complexity of implementing a pass might increase.

Thank you for reading such a long post. It feels great that we can try to figure out a better way to maintain the source information. :smiley:

1 Like

Sure thing–I think you broadly understand my proposal. Let me clarify some things:

It could be a pass or it could be some other way (e.g. modify Expr constructor). The tracing map is the goal, though.

That seems reasonable, so long as the model is always in A-Normal Form. If it isn’t then we may need Map<Expr,Expr> here. I think this was stated earlier, just reiterating.

This could also be handled by PassManager, but yeah that’s the right idea, if we took a pass-based approach here. I’ll sketch some ideas I have below.

This is pretty close to my suggestion, but let me tweak it slightly. The goal here would be to map a Var in the final Relay or TIR representation to a Var that represents it in the original program (assume the original program is expressed in A-Normal Form, and suppose we allow for trivial TupleGetItem Expr in this map, so %0.2 is a valid value). After running this pass, the map here would then look like:

# key: transformed Var
# values: Expr representing the original value
# keys not present where no mapping exists
map = {
    %input: %input,
    %model.conv1.weight: %model.conv1.weight,
    ...  # same for the rest of the inputs (not as trivial if the keys were instead TIR Var)
    %4: %0,  # I think I understood this transform properly, I think the reordering is due to A-Normal Form conversion after the rewrite, but that in the final program, %4 doesn't depend on %0, %1, %2, %3
    %1: %2  # or %1.0, if that was the only such representation of this.
}

Given this map, the Expr that could be used with SIBuilder then are just the keys of the map.

I think you could then implement a fairly simple algorithm to apply SIBuilder:

  1. Invert the variable map (swap keys and values).
  2. Step through the original program, and for each Relay Expr:
    1. Identify the inputs and outputs (this is akin to building a connectivity graph in the final program, but we sort of get it for free from the original)
    2. Lookup those values in the resultant program using the Map
    3. Create SIBuilder with span equal to the Relay Expr. Run RecursivelyFillSpan(outputs, inputs).
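The three steps above might be sketched like this (a toy model where Relay objects are strings and the SIBuilder call is stubbed out; none of these helper names are real TVM API):

```python
def apply_spans(original_exprs, var_map, spans):
    """original_exprs: list of (expr_name, inputs, output) in the original
    program; var_map: transformed Var -> original Var; spans: expr_name ->
    span. Returns a map from transformed Var to the span it should carry."""
    # Step 1: invert the variable map (original -> transformed).
    inverted = {orig: new for new, orig in var_map.items()}
    filled = {}
    # Step 2: step through the original program.
    for expr_name, inputs, output in original_exprs:
        # Step 2.2: look up the endpoints in the resultant program.
        new_output = inverted.get(output)
        if new_output is None:
            continue  # best-effort: this Var was eliminated by a pass
        # Step 2.3: stand-in for SIBuilder(span).RecursivelyFillSpan(...).
        filled[new_output] = spans[expr_name]
    return filled
```

For the outlined Pack example earlier, `apply_spans([("stack", ["%2"], "%7")], {"%3": "%7"}, {"stack": "stack"})` would map the new `%3` to the `stack` span.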

I haven’t thought about this enough, but I think this could run into some limitations maybe around loops and control flow, particularly if we apply the same approach to TIR. I’d need to think about it a bit further.

Building the map

As for how to build this map, here are some thoughts:

  1. Modify Expr() constructor to take another arg Expr orig_expr. Modify all passes to pass orig_expr.
  2. Change ExprMutator and kin to accept such a Map (or get it out of an IRModule attr). When Mutate_ returns a Node different than the one passed-in, modify the map.
  3. Attempt to derive this from an analysis pass as you mentioned.
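Option 2 might look roughly like the following toy model (nodes are plain strings and the rewrite rule is a lambda; a real version would hook ExprMutator::Mutate_ in C++):

```python
class TracingMutator:
    """Wraps a pass's rewrite rule and records new -> old whenever the
    rewrite returns a node different from the one passed in."""

    def __init__(self, rewrite):
        self.rewrite = rewrite
        self.var_map = {}  # new node -> old node

    def mutate(self, node):
        new_node = self.rewrite(node)
        if new_node != node:
            self.var_map[new_node] = node
        return new_node

# Example: a rewrite that turns add into subtract.
m = TracingMutator(lambda n: n.replace("add(", "subtract("))
m.mutate("add(%0, %c)")
```

After the call, `m.var_map` maps the rewritten binding back to the original one, which is exactly the tracing map discussed above.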

I think #1 or #2 may not cover all cases here, and some passes may also need to be updated. The reason I’m raising this here is it seems like equivalent work to track relationships between Vars, and if it was possible to get away with using that work to label Spans, we might be able to do this once. Finally, I’m thinking about how to apply SIBuilder to LowerTE, which is what generates TIR for Relay, and how to preserve that information when doing MetaSchedule-style transforms in a TensorIR world. It seems a bit more straightforward to propagate this Var relation rather than the Span info. Var tracking can also be useful in AOT for:

  • Identifying which TIR Vars represent which Relay Expr (e.g. implementing GraphExecutorDebug)
  • Profiling layers run in TIR, using those Vars as a hint for where a layer’s compute starts and stops.

Anyway, here I am curious to hear your thoughts on whether you think we could leverage this for Span annotations. The work here is helpful for the project either way, so I think we could also merge this now and, if we can improve the maintainability via Var tracking, we could make that improvement as a follow-on.

cc’ing some other folks who have been thinking about this at octo: @anwang @AndrewZhaoLuo @mbaret @mehrdadh

1 Like

This is awesome and super helpful work. Can’t wait to use it.

1 Like

Hi @areusch

Sorry for the late reply. Now I am able to grasp the whole concept of the Relay Var proposal much better. Thank you for your patience! :smiley: We have some intuitive thoughts about it, but just like you said, it deserves its own RFC if we want to introduce this tracing map. I will put the discussion of it at the end of this post.

Before that, may I know whether it would be fine to prepare our PRs in this RFC if they look good to you? We can categorize the PRs into three independent parts:

  1. Frontend span filling
  2. Schedule recorder
  3. Pass span filling

Currently most of the discussion is about the pass span part. We can continue our discussions on it, and at the same time, if the frontend span filling and schedule recorder look good to you, we will prepare their PRs and submit them soon. On the other hand, if pass span filling is a good enough midterm solution, we can also submit its PR later. Finally, based on our conclusion, we can create a new RFC about the Relay Var tracing map. Does this plan look good to you?


About the Var tracing map, I think it is a good mechanism, because we can always find where an IR expression comes from. Based on this idea we tried to find what obstacles we would need to break through. To me it is a really challenging topic. I totally agree with making a new pre-RFC for it. Haha

Data structure and what function to be called

  1. The tracing map should have the form <Var, Array<Var>>.

    To serve n-to-n (n ≥ 1) conversions, we need an array to preserve their relations.

  2. The IRModule includes the historical maps and functions during transformation.

    Therefore, for each pass, it might keep the tracing map together with a snapshot of the function before the pass (like main_before_SimplifyInference_0 above).

  3. SIBuilder might not be necessary in this scenario.

    Since we could get the expression mapping relationship by traversing the tracing map, we can assign the span to an expr directly; there is no need to find the inputs/outputs of a transformed expression.
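One hedged possibility for the shape of item 2 above (plain Python stand-ins, not real IRModule attrs) is a per-pass history of (pass name, tracing map, function snapshot), walked newest-first to expand a Var into its origins:

```python
history = [
    # (pass name, tracing map {new Var: [original Vars]}, snapshot name)
    ("SimplifyInference", {"%new": ["%old_a", "%old_b"]},
     "main_before_SimplifyInference_0"),
]

def origins_of(var, history):
    """Expand a Var through the histories, newest to oldest; the n-to-n
    conversions are why the map values are arrays of Vars."""
    frontier = [var]
    for _pass_name, trace_map, _snapshot in reversed(history):
        frontier = [src for v in frontier for src in trace_map.get(v, [v])]
    return frontier
```

A Var untouched by a pass simply passes through unchanged, which matches the best-effort nature of the scheme.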

Obstacles we might encounter

  1. We might need to construct a new data structure according to the index of a Var.

    I haven't fully read the Doc Printer. But if there is an example that looks like this:

    @fn () {
        %0 = ...
    }
    @main () {
        %0 = ...
    }
    

    Then we need to make our map able to recognize which %0 we are talking about.

  2. Annotating the original expr on the transformed expr is time consuming.

    Basically this seems to me the most doable way, but it is almost the same as what we are doing for the span filling. It would not be automatic enough, but at least it might be easier to achieve.

  3. Modifying Mutate_ of Mutator/Rewriter would involve a large number of changes.

    Almost all passes inherit from the Mutator or Rewriter, so we would need to check them carefully.

  4. Difficulty of making an analysis pass.

    So far I have not figured out a workable method. It becomes hard to do the analysis for passes with multiple sources/results.

  5. We should be aware of the performance impact.

    Once we have a sequence of maps and the original Relay functions, we need to do a map traversal for each expr in the end. The time complexity would be O(N*M), where N is the number of exprs and M is the number of maps.
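If the O(N*M) cost matters, one option (a sketch under the assumption that every surviving Var appears in each per-pass map) is to compose the M maps once, so each of the N final exprs becomes a single lookup:

```python
def compose(per_pass_maps):
    """per_pass_maps: oldest-first list of {new_var: old_var} dicts.
    Returns one end-to-end map from final Vars to original Vars."""
    composed = {}
    for trace_map in per_pass_maps:
        # Chase each new Var of this pass back through the older maps.
        composed = {new: composed.get(old, old)
                    for new, old in trace_map.items()}
    return composed

# Two passes rename %a -> %b -> %c; the composed map jumps straight to %a.
end_to_end = compose([{"%b": "%a"}, {"%c": "%b"}])
```

Vars dropped by an intermediate pass disappear from the composed map, again reflecting the best-effort nature of the tracking.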

That's all we can come up with currently. For the long-term solution, we think a tracing map would be a necessary mechanism, yet it should be planned carefully in case we encounter too much trouble. Currently the pass span filling can provide a rough mapping after transformation. Perhaps we can still consider using this feature for now, and try to complete the tracing map for a better result.

Thank you again for reading this. We will stay tuned! :slight_smile:

2 Likes

Here’s some replies to the first part of your post. I’ll get back to the rest of it in a few days here.

Go for it!

Yeah that sounds great to me. Apologies for derailing this around the Var tracing proposal.

1 Like

No problem :smiley:

We will start from the frontend span filling. Based on the comments, spans for parameters will be added. Once finished, we will submit the PR of each frontend one by one. Thank you!

Great job! I am interested in the feature. When will the feature be available to us?

1 Like

Hi zhaoyang-star,

Thank you for your compliment. :smiley:

We have asked our legal support, and things are complicated as we mentioned above. :frowning: So far we cannot provide a precise date for it. Yet at least I would say it will happen after the features in this RFC are ready and stable in the TVM main branch.

Hi chunit,

I have to say the PART part is very useful. In daily debugging, we need to trace Relay IR back to the original model frequently, so keeping the output name the same as the original name in the model makes it much easier. Without that, we always have to change the original name to original_OUTPUT.

Hi @lixiaoquan .

Thank you very much for giving us advice again! I agree with you that it is quite important to indicate which Relay expression generates the output tensor of its original frontend layer. But we encountered some problems and removed the suffix string entirely. Therefore we would like to ask for advice in this RFC. So far we have two possible ways to handle this issue.

  1. As you said, patch a suffix onto the expression which generates the output tensor.
  2. Leverage the Var mechanism proposed by @areusch to tag the output result.

The most straightforward way would be the first one, adding the suffix, and perhaps it is an acceptable compromise. In the following I would like to detail the problem and solution 1, because solution 2 still needs time to design. Note that we have pushed the very first PR of this RFC. It would be great if you have time to take a look at it. :slight_smile:

Problem

In the previous version we set _PART_{idx} as the suffix. However, once we proceed to pass transformation, this suffix becomes annoying and really hard to deal with. Even worse, after invoking several passes, these suffixes seem to be meaningless.

Solution1

Add a suffix _OUTPUT to indicate that we are generating the output expression of a frontend layer. To be more precise, here is the modification we can make in our PR:

  1. Now we invoke set_span in several places for the parameter sources. Those frontend sources will be converted to Constant or Var, and there is no need to indicate output for them. Therefore we would add one more parameter to the common API:

    def set_span(expr, span, is_output)
    
  2. Control the is_output flag to make the final expression tagged like SOURCE_NAME_OUTPUT. Take the TFLite op for example again; it would look like:

    def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
        %0 = shape_of(%input, dtype="int32") /* Shape_OUTPUT */;
        %1 = strided_slice(%0, …) /* strided_slice */;
        %2 = squeeze(%1) /* strided_slice_OUTPUT */;
        %3 = expand_dims(%2, axis=0) /* stack */;
        %4 = expand_dims(3, axis=0) /* stack */;
        %5 = expand_dims(3, axis=0) /* stack */;
        %6 = (%3, %4, %5) /* stack */;
        %7 = concatenate(%6) /* stack_OUTPUT */;
    }
    

    We could consider setting the span of the output expr as "stack_OUTPUT" or "stack FINAL_OUTPUT", or adding one more parameter to let users customize it. We are not sure whether it is good to spec the string or not.

  3. Personally I would like to set is_output to False by default. The reason is that most of the time, we will then run the build command or leverage the pass transformations. At that stage the OUTPUT suffix could become meaningless, as mentioned above. We can write some more documentation for both the set_span API and each frontend conversion to tell users where and when to turn this parameter on. That way users have the flexibility to obtain a more precise result (the output location), and will not be confused when approaching the pass transformations or even the lowering process.
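Putting points 1-3 together, the helper might be sketched as follows (the span is simplified to a plain string and the return value is a stand-in pair; the real set_span mutates Relay exprs):

```python
_OUTPUT_SUFFIX = "_OUTPUT"

def set_span(expr, span, is_output=False):
    """Attach `span` to `expr`; tag the frontend layer's output expr when
    `is_output` is True (False by default, per point 3)."""
    source_name = span + _OUTPUT_SUFFIX if is_output else span
    return (expr, source_name)  # stand-in for an expr with span attached

# The converter tags only the final expr of the frontend layer:
out = set_span("concatenate(%6)", "stack", is_output=True)
```

Intermediate exprs of the same layer would keep the bare `stack` name, so only `concatenate` carries `stack_OUTPUT`, matching the IR above.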

Thank you for reading such a long description. Hope it provides you more context. It is great to have a conversation about the issue. :smiley:

Thank you very much for the explanation. I agree suffixes may be hard to handle, but I just wonder why these suffixes become meaningless after passes. At that stage, we still need them to trace back to the original model. I agree that after some transformations, some ops may be eliminated and they may be grouped into different fused functions. But we can still have a clear view of where those ops come from.

I think precisely because the IR may change a lot due to DynamicToStatic/Const Folding/..., those spans become even more important after transformation. It sometimes has an extra benefit: when I see an op without a span, I can guess it is generated by a pass, not from the original model.

Hi @lixiaoquan

Thank you very much for the explanation.

You’re welcome, it’s really great to have a discussion about this issue. :smiley:

It sometimes has an extra benefit: when I see an op without a span, I can guess it is generated by a pass, not from the original model.

Just one more thing before the long, detailed description about the removal of the suffix. About the span after processing a pass: in the third part of this RFC we aim to fill spans for those Relay passes which are involved in the build flow. Therefore, if everything in this RFC goes well, you can expect span-tagged results after invoking certain passes.

In the following I will try to describe the removal of suffixes from several aspects. Hope it provides more reasons why we want to remove them all in the end. :slight_smile:

Reference to a related work

Take the Compiler Explorer for example: most functions in the high-level language (C++) are converted to a sequence of lower-level functions (assembly). It is not necessary to tag a suffix onto the generated lower-level functions. In this scenario, knowing which set of lower-level functions a high-level construct is converted to is enough. Similarly, the suffixes might not be necessary in Relay, because knowing which set of Relay exprs a frontend model layer is converted to might be enough at first glance.

How much benefit can we get from the suffix?

Personally, this is the most important reason to me. Suppose what we focus on is "frontend → Relay"; what would happen if we kept the suffixes after invoking build() or optimize()? There might be a span-tagged expr like this:

%0 = expr(...) /* span=other_expr_PART_3 */

If the user does not check the very first Relay module, which is generated by the from_{frontend}() API, the _PART_ means nothing to them. (The only thing they know is: OK, the generation of this expr is related to "other_expr", but I don't know the meaning of PART.) Even worse, the suffix might confuse them: where can I find the PART? Therefore, if we focus on the "frontend → Relay" conversion only, it is not a good choice to use the suffix when passes are involved.

On the other hand, the value of the suffix might be as a special indicator of an expr between passes, which means "Relay → Relay". However, the more we try to enhance the debugging ability, the more we find that "sourceName" might not be the best choice for it. For example, suppose I have an IR in its original form like this:

# IR1
fn() {
  %0 = reshape(..) /* span=reshape_add_PART_1 */
  %1 = add() /* span=reshape_add */
}

Then I invoke a pass, which converts add call to subtract call:

# IR2
fn() {
  %0 = reshape(..) /* span=reshape_add_PART_1 */
  %1 = subtract() /* span=reshape_add */
}

Finally I invoke a pass to merge these two exprs into an op defined by myself:

# IR3
fn() {
  %0 = my_expr() /* span=[reshape_add_PART_1, reshape_add] */
}

You can see that the my_expr call comes from reshape_add_PART_1 and reshape_add. However, this expresses the relation between IR3 and IR1. When I want to compare the relation between IR3 and IR2, I have to rely on the legacy (the sourceName of the span) from IR1, which might be confusing by its literal meaning. The better choice in this scenario might be using the line and column information, so that the relationship between IR3 and IR2 is more straightforward. Yet there are still things to be dealt with if we want to use the line and column information, like how to define the line number of an IR module.

In summary, in the "frontend → Relay" conversion the suffix seems unnecessary, and in the "Relay → Relay" conversion, using the suffix is not straightforward enough.

Any workaround?

If the suffix is something really hard to give up, perhaps a post-processing pass could be a workaround. For example, a pass traverses the whole Relay IR and tags _PART_ onto those spans with the same SourceName. With the help of this pass, we can have a very similar result to the previously reverted PR.
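Such a workaround pass might be sketched like this (exprs modeled as (text, source_name) pairs rather than real Relay nodes; the numbering simply mirrors the earlier _PART_{idx} examples, where the layer's last expr keeps the bare name):

```python
from collections import Counter

def tag_parts(exprs):
    """Append _PART_{idx} to every expr sharing a SourceName except the
    last one, which keeps the bare name."""
    total = Counter(source for _, source in exprs)
    seen = Counter()
    tagged = []
    for text, source in exprs:
        seen[source] += 1
        if seen[source] < total[source]:
            tagged.append((text, f"{source}_PART_{seen[source]}"))
        else:
            tagged.append((text, source))
    return tagged

# Mirrors the IR1 example above: reshape gets _PART_1, add keeps the name.
ir = [("reshape(..)", "reshape_add"), ("add(..)", "reshape_add")]
```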

Thank you for reading such a long reply. It feels great to have a discussion with you about this topic. :smiley:

Hi chunit,

It seems your Explorer can find the boundary of a group of IRs from a layer; that is wonderful. Does it use the same Span to determine a group? If so, I guess there is a quick search to find all the IRs with the same Span.

If we need to determine the group by eye, we have to do a quick search too.

Is it possible to expand Span a little bit to indicate it is an "output"? Then the original source name doesn't have to be changed with any tag, and it will not mess up other parts of your design. And that "output" field can be printed in text for a quick comparison between the Relay IR and the original model (just for humans).

def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
    %0 = shape_of(%input, dtype="int32") /*Shape*/ /* Shape_OUTPUT */;
    %1 = strided_slice(%0, …) /* strided_slice */;
    %2 = squeeze(%1) /* strided_slice*/ /* strided_slice_OUTPUT*/;
    %3 = expand_dims(%2, axis=0) /* stack */;
    %4 = expand_dims(3, axis=0) /* stack */;
    %5 = expand_dims(3, axis=0) /* stack */;
    %6 = (%3, %4, %5) /* stack */;
    %7 = concatenate(%6) /* stack */ /* stack_OUTPUT */ ;
}
def @main (%input: Tensor[(?, ?, 3, 1), float32]) {
    %0 = shape_of(%input, dtype="int32") /*Shape*/ /* Shape_OUTPUT */;
    %1 = strided_slice(%0, …) /* strided_slice */;
    %2 = squeeze(%1) /* strided_slice*/ /* strided_slice_OUTPUT*/;
    %3 = expand_dims(%2, axis=0) /* stack */;
    %4 = expand_dims(3, axis=0) /* stack */;
    %5 = expand_dims(3, axis=0) /* stack */;
    %10 = fn(%001, %002, %003) {
       %6 = (%001, %002, %003) /* stack */;
       %7 = concatenate(%6) /* stack */ /* stack_OUTPUT */ ;
    }
    %11 = %10(%3, %4, %5)
}

In a case like that, when IRs from a layer are put in different functions, we can still find the output of stack quickly.

Hi @lixiaoquan

Thanks for the fast reply! :smiley:

Does it use the same Span to determine a group?

About the Explorer: yes, we group a set of exprs based on the span. The exprs with the same span have the same color. When hovering the mouse over them, the set of exprs becomes a bit darker. Besides, our UI provides search functionality too. For your reference, the following picture demonstrates the behavior:

Is it possible to expand span a little bit to indicate it is an “Output”?

Thank you for pointing out this way to deal with the problem. :smiley:

It should be doable; we can achieve this by modifying the attributes of Span and the Relay text printer. The attributes of Span would become like this:

class SpanNode : public Object {
 public:
  SourceName source_name;
  int line;
  int column;
  int end_line;
  int end_column;
  bool frontend_output;
  //...
};

Yet I'm not quite sure whether it would be a better choice compared to the way I mentioned here, which aims to let advanced users configure it on their own. The main reason, in my view, is that Span seems to be designed for "Expr", the parent class of "RelayExpr". Take a look at the relationships of the class BaseExprNode: we can see that RelayExprNode and PrimExprNode both derive from this base class. It might be too specific to add one more attribute for indicating whether the tagged expr corresponds to the output of its frontend layer.

class BaseExprNode : public Object {
 public:
  /*!
   * \brief Span that points to the original source code.
   *        Reserved debug information.
   */
  mutable Span span;
  static constexpr const char* _type_key = "BaseExpr";
  //...
};
class RelayExprNode : public BaseExprNode {...}
class PrimExprNode : public BaseExprNode {...}

Thanks for reading again. If I missed something, please don’t hesitate to let me know. :smiley:

Hi chunit,

I understand the point that Span is not only for Relay. I think Span was designed for the relay text format in the very beginning, because it has source_name, line, and column, and there is an AnnotateSpan() pass to get the line and column. It has already been a compromise to fill the layer name into the source name.

Hi @lixiaoquan

It has already been a compromise to fill layer name to source name.

I see, I had read the comments in your PR before. Personally, I like the choice of using the layer name as the source name, because we cannot always get the source file name of a neural network model. Somehow, I prefer the layer name to the source file name :smiley:.

I will try to summarize what we have discussed below. I believe we can find a way to keep the generality and also get the output information.

Back to the topic: currently we have two choices to mark the converted expr which corresponds to the frontend output.

1. Extend the object Span: Add an attribute

As you mentioned in your reply, we can add an attribute to mark that the span is the output of a frontend layer. The implementation would look like:

class SpanNode : public Object {
 public:
  SourceName source_name;
  int line;
  int column;
  int end_line;
  int end_column;
  bool frontend_output;
  //...
};

The pro of this method is that it is very straightforward: we can obtain the output information by modifying only the attributes of Span and the relay text printer.

The con is that this attribute is too specific to the frontend → relay conversion.
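For illustration (a plain-Python mock, not the actual TVM printer), the relay text printer could consult the new flag when it emits the span comment:

```python
class MockSpan:
    """Stand-in for SpanNode with the proposed frontend_output flag."""

    def __init__(self, source_name, frontend_output=False):
        self.source_name = source_name
        self.frontend_output = frontend_output

def span_comment(span):
    """Render the span as a relay-text comment, tagging frontend outputs."""
    name = span.source_name
    if span.frontend_output:
        name += "_OUTPUT"
    return "/* " + name + " */"
```

For example, `span_comment(MockSpan("stack", True))` yields the `/* stack_OUTPUT */` comment discussed above, while untagged spans print as before.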

2. Extend the Mutator SpanFiller: Add a user-defined callback

I had a discussion with my team members about how to make it more generic. The following is an outline of what we came up with:

SpanFiller is the core of set_span. We could define an interface for SpanFiller, which SpanFiller invokes to rewrite the source name; users override the interface to achieve the behavior they want.

To be more precise, the SpanFiller might become like the following:

class _SpanFiller(ExprMutator):
    """SpanFiller"""

    def __init__(self, span, annotator=None):
        #...
        self._annotator = annotator

There are several ways to pass the annotator to SpanFiller, because set_span is invoked inside the from_{frontend} API. We might obtain the annotator by accessing the PassContext, by changing the from_{frontend} interface, and so on. Here we take the PassContext way as an example:

my_annotator = FrontendSpanAnnotator(...)
with tvm.transform.PassContext(config={"relay.frontend.span_annotator": my_annotator}):
    mod, param = from_{frontend}(...)

The annotator itself might look something like this:

# Interface class
class FrontendSpanAnnotatorBase:
    # source_str is obtained from SpanFiller
    def annotate_source_name(self, source_str):
        raise NotImplementedError()

    def generate_span(self, source_str):
        return tvm.relay.Span(
            self.annotate_source_name(source_str), 0, 0, 0, 0
        )

Users implement this interface, and SpanFiller then uses the annotator to change the source name of the current span. Take the visitor of Var as an example; it would look like:

def visit_var(self, var):
    if self._annotator:
        return _expr.VarWithFields(..., self._annotator.generate_span(...))
    return _expr.VarWithFields(var, var.vid, var.type_annotation, None, self._span)

It’s easy to tag the suffix “_OUTPUT” onto the source name with the annotator. The pros of this method are, first, that it provides users high flexibility to define their own annotating method, and second, that it keeps the generality of the Span object. The con, of course, is that it requires some fundamental knowledge from users to build the annotator.
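For example, a user-defined annotator built on the interface above might look like the sketch below (the base class is repeated so the snippet is self-contained, and the subclass name is hypothetical):

```python
class FrontendSpanAnnotatorBase:
    """Minimal stand-in for the interface sketched above."""

    def annotate_source_name(self, source_str):
        raise NotImplementedError()

class OutputSuffixAnnotator(FrontendSpanAnnotatorBase):
    """Hypothetical annotator: appends "_OUTPUT" to layers the user
    marked as frontend outputs, and leaves other names untouched."""

    def __init__(self, output_layers):
        self._output_layers = set(output_layers)

    def annotate_source_name(self, source_str):
        if source_str in self._output_layers:
            return source_str + "_OUTPUT"
        return source_str

annotator = OutputSuffixAnnotator(["stack"])
```

The same mechanism would let users inject any other naming scheme without touching Span itself.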

These two methods are what we have so far. Thank you for discussing this issue with us. Now we have more ideas about how to tackle this problem :slight_smile: .

One more thing: our first PR is almost done, and its changes do not alter the behavior of the current _set_span, so we would like to keep that PR unchanged. If the proposal here looks good to you, we can submit another PR to add the functionality we discussed here. Would that be fine with you?

Thank you for reading. Any comment is appreciated. :smiley:

Hi chunit,

I appreciate that you can consider my comment.

Personally I prefer 1), because it is a minor change and we can still make use of TVM Explorer. If source_name is changed, the changed span may not be able to work with TVM Explorer.

Hi @lixiaoquan

Sorry for the late reply. Thanks a lot for continuing to discuss this with us! :grinning_face_with_smiling_eyes:

Personally I prefer to 1), because that is a minor change

I see. I agree, method 1) is pretty straightforward, and we are fine with both methods. Yet personally I feel method 2) could be a better foundation for any customized requirement. @areusch, would you mind giving us some more suggestions from a maintainer’s point of view? Thanks :smiley:

we can still make use of TVM Explorer. If source_name is changed, the changed span may not be able to work with TVM Explorer.

About the Explorer, both methods could fit into it, and both require the Explorer to add some more text-parsing functions; the effort is the same.
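As a sketch of the parsing work involved (plain Python, not Explorer code), recovering the source name and the output flag from a printed relay line could look like:

```python
import re

def parse_span_comment(line):
    """Return (source_name, is_output) parsed from a relay text line
    such as '%7 = concatenate(%6) /* stack_OUTPUT */;', or None when
    the line carries no span comment."""
    m = re.search(r"/\*\s*(\S+?)(_OUTPUT)?\s*\*/", line)
    if m is None:
        return None
    return m.group(1), m.group(2) is not None
```

Either method would need roughly this much parsing on the Explorer side, which is why the effort is comparable.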