How to read out intermediate values in Relay IR?

Hello TVM community,

I have a question about how to read out intermediate values in Relay IR.

For a mod that the user creates manually, I know we can expose arbitrary outputs by returning a tuple.

For example, to read out output_0, output_1, and output_2, we can write:

data = relay.var("data", relay.TensorType(dshape, "float32"))
output_0 = ...  # whatever operations
output_1 = ...  # whatever operations
output_2 = ...  # whatever operations
func = relay.Function([data], relay.Tuple([output_0, output_1, output_2]))
mod = tvm.IRModule.from_expr(func)

However, for a mod converted from another DNN framework, how can I add such extra outputs?

For example, I have successfully converted a DistilBERT model from PyTorch to TVM Relay IR. Here is the output of print("original mod: \n", mod.astext(show_meta_data=False)):

def @main(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather/resource: Tensor[(30522, 768), float32], %x: Tensor[(1, 128), int32], %tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather_1/resource: Tensor[(512, 768), float32], %tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/ones: Tensor[(1, 128), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/mul/ReadVariableOp/resource: 
Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/Tensordot/ReadVariableOp/resource: Tensor[(768, 3072), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/BiasAdd/ReadVariableOp/resource: Tensor[(3072), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/Tensordot/ReadVariableOp/resource: Tensor[(3072, 768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32]) 


{
  %0 = expand_dims(meta[relay.Constant][0] /* ty=Tensor[(128), int32] */, axis=0) /* ty=Tensor[(1, 128), int32] */;
  %1 = take(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather_1/resource, %0, axis=0) /* ty=Tensor[(1, 128, 768), float32] */;
  %2 = take(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather/resource, %x, axis=0) /* ty=Tensor[(1, 128, 768), float32] */;
  %3 = tile(%1, reps=[1, 1, 1]) /* ty=Tensor[(1, 128, 768), float32] */;
  %4 = add(%2, %3) /* ty=Tensor[(1, 128, 768), float32] */;
  %5 = mean(%4, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1), float32] */;
  %6 = subtract(%4, %5) /* ty=Tensor[(1, 128, 768), float32] */;
  %7 = multiply(%6, %6) /* ty=Tensor[(1, 128, 768), float32] */;
  %8 = mean(%7, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1), float32] */;
  %9 = add(%8, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 128, 1), float32] */;
  %10 = power(%9, -0.5f /* ty=float32 */) /* ty=Tensor[(1, 128, 1), float32] */;
  %11 = multiply(%10, %tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %12 = multiply(%5, %11) /* ty=Tensor[(1, 128, 768), float32] */;
  %13 = multiply(%4, %11) /* ty=Tensor[(1, 128, 768), float32] */;
  %14 = subtract(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource, %12) /* ty=Tensor[(1, 128, 768), float32] */;
  %15 = add(%13, %14) /* ty=Tensor[(1, 128, 768), float32] */;
  %16 = reshape(%15, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %17 = transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
  %18 = nn.dense(%16, %17, units=768) /* ty=Tensor[(128, 768), float32] */;
  %19 = reshape(%18, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %20 = add(%19, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %21 = reshape(%20, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12, 64), float32] */;
  %22 = cast(768 /* ty=int32 */, dtype="float64") /* ty=float64 */;
  %23 = cast(12 /* ty=int32 */, dtype="float64") /* ty=float64 */;
  %24 = divide(%22, %23) /* ty=float64 */;
  %25 = cast(%24, dtype="int32") /* ty=int32 */;
  %26 = cast(%25, dtype="float32") /* ty=float32 */;
  %27 = transpose(%21, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %28 = power(%26, -0.5f /* ty=float32 */) /* ty=float32 */;
  %29 = multiply(%27, %28) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %30 = reshape(%15, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %31 = transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
  %32 = nn.dense(%30, %31, units=768) /* ty=Tensor[(128, 768), float32] */;
  %33 = reshape(%32, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %34 = add(%33, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %35 = reshape(%34, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12, 64), float32] */;
  %36 = transpose(%35, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %37 = reshape(%29, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64), float32] */;
  %38 = reshape(%36, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64), float32] */;
  %39 = nn.batch_matmul(%37, %38, meta[relay.attrs.BatchMatmulAttrs][0]) /* ty=Tensor[(12, 128, 128), float32] */;
  %40 = reshape(%tf_distil_bert_for_sequence_classification/distilbert/ones, newshape=[1, 1, 1, 128]) /* ty=Tensor[(1, 1, 1, 128), float32] */;
  %41 = subtract(1f /* ty=float32 */, %40) /* ty=Tensor[(1, 1, 1, 128), float32] */;
  %42 = reshape(%39, newshape=[1, 12, 128, 128]) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  %43 = multiply(1e+30f /* ty=float32 */, %41) /* ty=Tensor[(1, 1, 1, 128), float32] */;
  %44 = subtract(%42, %43) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  %45 = nn.softmax(%44) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  %46 = reshape(%15, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %47 = transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
  %48 = nn.dense(%46, %47, units=768) /* ty=Tensor[(128, 768), float32] */;
  %49 = reshape(%48, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %50 = add(%49, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %51 = reshape(%50, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12, 64), float32] */;
  %52 = transpose(%51, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %53 = reshape(%52, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64), float32] */;
  %54 = reshape(%45, newshape=[12, 128, 128]) /* ty=Tensor[(12, 128, 128), float32] */;
  %55 = transpose(%53, axes=[0, 2, 1]) /* ty=Tensor[(12, 64, 128), float32] */;
  %56 = nn.batch_matmul(%54, %55, meta[relay.attrs.BatchMatmulAttrs][1]) /* ty=Tensor[(12, 128, 64), float32] */;
  %57 = reshape(%56, newshape=[1, 12, 128, 64]) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %58 = transpose(%57, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 128, 12, 64), float32] */;
  %59 = reshape(%58, newshape=[1, -1, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %60 = reshape(%59, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %61 = transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
  %62 = nn.dense(%60, %61, units=768) /* ty=Tensor[(128, 768), float32] */;
  %63 = reshape(%62, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %64 = add(%63, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %65 = add(%64, %15) /* ty=Tensor[(1, 128, 768), float32] */;
  %66 = mean(%65, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1), float32] */;
  %67 = subtract(%65, %66) /* ty=Tensor[(1, 128, 768), float32] */;
  %68 = multiply(%67, %67) /* ty=Tensor[(1, 128, 768), float32] */;
  %69 = mean(%68, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1), float32] */;
  %70 = add(%69, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 128, 1), float32] */;
  %71 = power(%70, -0.5f /* ty=float32 */) /* ty=Tensor[(1, 128, 1), float32] */;
  %72 = multiply(%71, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/mul/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %73 = multiply(%66, %72) /* ty=Tensor[(1, 128, 768), float32] */;
  %74 = multiply(%65, %72) /* ty=Tensor[(1, 128, 768), float32] */;
  %75 = subtract(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/ReadVariableOp/resource, %73) /* ty=Tensor[(1, 128, 768), float32] */;
  %76 = add(%74, %75) /* ty=Tensor[(1, 128, 768), float32] */;
  %77 = reshape(%76, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %78 = transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(3072, 768), float32] */;
  %79 = nn.dense(%77, %78, units=3072) /* ty=Tensor[(128, 3072), float32] */;
  %80 = reshape(%79, newshape=[1, 128, 3072]) /* ty=Tensor[(1, 128, 3072), float32] */;
  %81 = add(%80, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 3072), float32] */;
  %82 = divide(%81, 1.41421f /* ty=float32 */) /* ty=Tensor[(1, 128, 3072), float32] */;
  %83 = erf(%82) /* ty=Tensor[(1, 128, 3072), float32] */;
  %84 = multiply(0.5f /* ty=float32 */, %81) /* ty=Tensor[(1, 128, 3072), float32] */;
  %85 = add(1f /* ty=float32 */, %83) /* ty=Tensor[(1, 128, 3072), float32] */;
  %86 = multiply(%84, %85) /* ty=Tensor[(1, 128, 3072), float32] */;
  %87 = reshape(%86, newshape=[128, 3072]) /* ty=Tensor[(128, 3072), float32] */;
  %88 = transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 3072), float32] */;
  %89 = nn.dense(%87, %88, units=768) /* ty=Tensor[(128, 768), float32] */;
  %90 = reshape(%89, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %91 = add(%90, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %92 = add(%91, %76) /* ty=Tensor[(1, 128, 768), float32] */;
  %93 = mean(%92, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1), float32] */;
  %94 = subtract(%92, %93) /* ty=Tensor[(1, 128, 768), float32] */;
  %95 = multiply(%94, %94) /* ty=Tensor[(1, 128, 768), float32] */;
  %96 = mean(%95, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1), float32] */;
  %97 = add(%96, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 128, 1), float32] */;
  %98 = power(%97, -0.5f /* ty=float32 */) /* ty=Tensor[(1, 128, 1), float32] */;
  %99 = multiply(%98, %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/mul/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %100 = multiply(%93, %99) /* ty=Tensor[(1, 128, 768), float32] */;
  %101 = multiply(%92, %99) /* ty=Tensor[(1, 128, 768), float32] */;
  %102 = subtract(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/ReadVariableOp/resource, %100) /* ty=Tensor[(1, 128, 768), float32] */;
  add(%101, %102) /* ty=Tensor[(1, 128, 768), float32] */
}

To read out the final output, I can use module.get_output(0).

However, this only allows the user to read the output of the last operation, which is

add(%101, %102) /* ty=Tensor[(1, 128, 768), float32] */

I am wondering whether it is possible to read out an intermediate value like %60 or %80.

Can the user modify the number of outputs in Relay IR and read them out?

Thanks! :slight_smile:

cc @comaniac @AndrewZhaoLuo @masahi

I heard that’s possible with the debug_executor, but I’ve never tried it. Can you take a look?


Here is an example of using the graph debugger: https://github.com/AndrewZhaoLuo/TVM-Sandbox/blob/main/relay/graph_debugger_example.py. My apologies that it isn’t very complete, and you’ll have to manually associate the tensor dumps with their nodes.

If you use the graph debugger, there is also an interesting function, get_node_output, which might be promising.

Finally, debug_get_output is also a function you can call on rt_mod from the above example: https://github.com/apache/tvm/blob/main/python/tvm/contrib/debugger/debug_executor.py#L234


Thanks for your reply. I will take a look at how to use the debug_executor to achieve this.

Also, I am asking this question because it is related to pipeline execution. So I am still wondering whether the user can register operations in Relay IR as new outputs.

In my case, I split a 12-layer BERT model into two subgraphs: the first contains the first 4 layers and the second contains the last 8 layers. Following Relay IR rules, each subgraph has only one global output, which is its last operation.

BERT in Relay IR:

running now:  BERT
original mod: 
 #[version = "0.0.5"]
fn (%tf_bert_for_sequence_classification/bert/embeddings/Gather/resource: Tensor[(30522, 768), float32], %x: Tensor[(1, 128), int32], %tf_bert_for_sequence_classification/bert/embeddings/Gather_1/resource: Tensor[(512, 768), float32], %tf_bert_for_sequence_classification/bert/embeddings/Gather_2/resource: Tensor[(2, 768), float32], %tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/value/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/value/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/output/dense/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/output/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/output/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/output/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), 
float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/intermediate/dense/Tensordot/ReadVariableOp/resource: Tensor[(768, 3072), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/intermediate/dense/BiasAdd/ReadVariableOp/resource: Tensor[(3072), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/output/dense/Tensordot/ReadVariableOp/resource: Tensor[(3072, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/output/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/output/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._0/output/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/self/query/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/self/query/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/self/key/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/self/key/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/self/value/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/self/value/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/output/dense/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/output/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], 
%tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/output/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/attention/output/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/intermediate/dense/Tensordot/ReadVariableOp/resource: Tensor[(768, 3072), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/intermediate/dense/BiasAdd/ReadVariableOp/resource: Tensor[(3072), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/output/dense/Tensordot/ReadVariableOp/resource: Tensor[(3072, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/output/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/output/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._1/output/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], ....%tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/self/query/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/self/query/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/self/key/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/self/key/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/self/value/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/self/value/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], 
%tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/output/dense/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/output/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/output/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/attention/output/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/intermediate/dense/Tensordot/ReadVariableOp/resource: Tensor[(768, 3072), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/intermediate/dense/BiasAdd/ReadVariableOp/resource: Tensor[(3072), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/output/dense/Tensordot/ReadVariableOp/resource: Tensor[(3072, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/output/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/output/LayerNorm/batchnorm/mul/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._11/output/LayerNorm/batchnorm/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/pooler/dense/MatMul/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/pooler/dense/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/classifier/MatMul/ReadVariableOp/resource: Tensor[(768, 2), float32], %tf_bert_for_sequence_classification/classifier/BiasAdd/ReadVariableOp/resource: Tensor[(2), float32]) {
  %0 = expand_dims(meta[relay.Constant][0], axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/ExpandDims */;
  %1 = take(%tf_bert_for_sequence_classification/bert/embeddings/Gather_1/resource, %0, axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/Gather_1 */;
  %2 = take(%tf_bert_for_sequence_classification/bert/embeddings/Gather/resource, %x, axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/Gather */;
  %3 = tile(%1, reps=[1, 1, 1]) /* tf_bert_for_sequence_classification/bert/embeddings/Tile */;
  %4 = full(0, shape=[1, 128], dtype="int32") /* tf_bert_for_sequence_classification/bert/Fill_1 */;
  %5 = add(%2, %3) /* tf_bert_for_sequence_classification/bert/embeddings/add/add */;
  %6 = take(%tf_bert_for_sequence_classification/bert/embeddings/Gather_2/resource, %4, axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/Gather_2 */;
  %7 = add(%5, %6) /* tf_bert_for_sequence_classification/bert/embeddings/add/add_1 */;
  %8 = mean(%7, axis=[2], keepdims=True) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/moments/mean */;
  %9 = subtract(%7, %8);
  %10 = multiply(%9, %9) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/moments/SquaredDifference */;
  %11 = mean(%10, axis=[2], keepdims=True) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/moments/variance */;
  %12 = add(%11, 1e-12f) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/add */;
  %13 = power(%12, -0.5f) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/Rsqrt */;
  %14 = multiply(%13, %tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul */;
  %15 = multiply(%8, %14) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul_2 */;
  %16 = multiply(%7, %14) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul_1 */;
  %17 = subtract(%tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource, %15) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/sub */;
  %18 = add(%16, %17) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/add_1 */;
  %19 = reshape(%18, newshape=[128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/Reshape */;
  %20 = transpose(%tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/ReadVariableOp/resource, axes=[1, 0]);
  %21 = nn.dense(%19, %20, units=768) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/MatMul */;
  %22 = reshape(%21, newshape=[1, 128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot */;
  %23 = add(%22, %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/BiasAdd/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/BiasAdd */;
  %24 = reshape(%23, newshape=[1, -1, 12, 64]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/Reshape */;
  %25 = transpose(%24, axes=[0, 2, 1, 3]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/transpose */;
  %26 = reshape(%18, newshape=[128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/Reshape */;
  %27 = transpose(%tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/ReadVariableOp/resource, axes=[1, 0]);
  %28 = nn.dense(%26, %27, units=768) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/MatMul */;
  %29 = reshape(%28, newshape=[1, 128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot */;
  %30 = add(%29, %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/BiasAdd/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/BiasAdd */;
  %31 = reshape(%30, newshape=[1, -1, 12, 64]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/Reshape_1 */;
  %32 = transpose(%31, axes=[0, 2, 1, 3]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/transpose_1 */;
  %33 = reshape(%25, newshape=[12, 128, 64]);
  %34 = reshape(%32, newshape=[12, 128, 64]);
  %35 = nn.batch_matmul(%33, %34, transpose_b=True);
  %36 = reshape(%35, newshape=[1, 12, 128, 128]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/MatMul */;
  %37 = full(1, shape=[1, 128], dtype="int32") /* tf_bert_for_sequence_classification/bert/Fill */;
  %38 = reshape(%37, newshape=[1, 1, 1, 128]) /* tf_bert_for_sequence_classification/bert/Reshape */;
  %39 = cast(%38, dtype="float32") /* tf_bert_for_sequence_classification/bert/Cast */;
  %40 = subtract(1f, %39) /* tf_bert_for_sequence_classification/bert/Sub */;
  %41 = divide(%36, 8f) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/truediv */;
  %42 = multiply(%40, -10000f) /* tf_bert_for_sequence_classification/bert/Mul */;
  .....
  %361 = reshape(%339, newshape=[128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/value/Tensordot/Reshape */;
  ...
  %974 = transpose(%tf_bert_for_sequence_classification/bert/pooler/dense/MatMul/ReadVariableOp/resource, axes=[1, 0]);
  %975 = nn.dense(%973, %974, units=768) /* tf_bert_for_sequence_classification/bert/pooler/dense/MatMul */;
  %976 = add(%975, %tf_bert_for_sequence_classification/bert/pooler/dense/BiasAdd/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/pooler/dense/BiasAdd */;
  %977 = tanh(%976) /* tf_bert_for_sequence_classification/bert/pooler/dense/Tanh */;
  %978 = transpose(%tf_bert_for_sequence_classification/classifier/MatMul/ReadVariableOp/resource, axes=[1, 0]);
  %979 = nn.dense(%977, %978, units=2) /* tf_bert_for_sequence_classification/classifier/MatMul */;
  add(%979, %tf_bert_for_sequence_classification/classifier/BiasAdd/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/classifier/BiasAdd */
}

first subgraph (mod 0) in Relay IR:

mods 0: def @main(%tf_bert_for_sequence_classification/bert/embeddings/Gather/resource: Tensor[(30522, 768), float32], %x: Tensor[(1, 128), int32], .... (ignore){
  %0 = expand_dims(meta[relay.Constant][0], axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/ExpandDims */;
  %1 = take(%tf_bert_for_sequence_classification/bert/embeddings/Gather_1/resource, %0, axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/Gather_1 */;
  %2 = take(%tf_bert_for_sequence_classification/bert/embeddings/Gather/resource, %x, axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/Gather */;
  %3 = tile(%1, reps=[1, 1, 1]) /* tf_bert_for_sequence_classification/bert/embeddings/Tile */;
  %4 = full(0, shape=[1, 128], dtype="int32") /* tf_bert_for_sequence_classification/bert/Fill_1 */;
  %5 = add(%2, %3) /* tf_bert_for_sequence_classification/bert/embeddings/add/add */;
  %6 = take(%tf_bert_for_sequence_classification/bert/embeddings/Gather_2/resource, %4, axis=0) /* tf_bert_for_sequence_classification/bert/embeddings/Gather_2 */;
  %7 = add(%5, %6) /* tf_bert_for_sequence_classification/bert/embeddings/add/add_1 */;
  %8 = mean(%7, axis=[2], keepdims=True) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/moments/mean */;
  %9 = subtract(%7, %8);
  %10 = multiply(%9, %9) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/moments/SquaredDifference */;
  %11 = mean(%10, axis=[2], keepdims=True) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/moments/variance */;
  %12 = add(%11, 1e-12f) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/add */;
  %13 = power(%12, -0.5f) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/Rsqrt */;
  %14 = multiply(%13, %tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul */;
  %15 = multiply(%8, %14) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul_2 */;
  %16 = multiply(%7, %14) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/mul_1 */;
  %17 = subtract(%tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource, %15) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/sub */;
  %18 = add(%16, %17) /* tf_bert_for_sequence_classification/bert/embeddings/LayerNorm/batchnorm/add_1 */;
  %19 = reshape(%18, newshape=[128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/Reshape */;
  %20 = transpose(%tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/ReadVariableOp/resource, axes=[1, 0]);
  %21 = nn.dense(%19, %20, units=768) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot/MatMul */;
  %22 = reshape(%21, newshape=[1, 128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/Tensordot */;
  %23 = add(%22, %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/BiasAdd/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/query/BiasAdd */;
  %24 = reshape(%23, newshape=[1, -1, 12, 64]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/Reshape */;
  %25 = transpose(%24, axes=[0, 2, 1, 3]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/transpose */;
  %26 = reshape(%18, newshape=[128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/Reshape */;
  %27 = transpose(%tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/ReadVariableOp/resource, axes=[1, 0]);
  %28 = nn.dense(%26, %27, units=768) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot/MatMul */;
  %29 = reshape(%28, newshape=[1, 128, 768]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/Tensordot */;
  %30 = add(%29, %tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/BiasAdd/ReadVariableOp/resource) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/key/BiasAdd */;
  %31 = reshape(%30, newshape=[1, -1, 12, 64]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/Reshape_1 */;
  %32 = transpose(%31, axes=[0, 2, 1, 3]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/transpose_1 */;
  %33 = reshape(%25, newshape=[12, 128, 64]);
  %34 = reshape(%32, newshape=[12, 128, 64]);
  %35 = nn.batch_matmul(%33, %34, transpose_b=True);
  %36 = reshape(%35, newshape=[1, 12, 128, 128]) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/MatMul */;
  %37 = full(1, shape=[1, 128], dtype="int32") /* tf_bert_for_sequence_classification/bert/Fill */;
  %38 = reshape(%37, newshape=[1, 1, 1, 128]) /* tf_bert_for_sequence_classification/bert/Reshape */;
  %39 = cast(%38, dtype="float32") /* tf_bert_for_sequence_classification/bert/Cast */;
  %40 = subtract(1f, %39) /* tf_bert_for_sequence_classification/bert/Sub */;
  %41 = divide(%36, 8f) /* tf_bert_for_sequence_classification/bert/encoder/layer_._0/attention/self/truediv */;
  %42 = multiply(%40, -10000f) /* tf_bert_for_sequence_classification/bert/Mul */;
..... (ignore)

second subgraph (mod 1) in Relay IR:

mods 1: def @main(%x: Tensor[(1, 128, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/query/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/query/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/key/Tensordot/ReadVariableOp/resource: Tensor[(768, 768), float32], %tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/key/BiasAdd/ReadVariableOp/resource: Tensor[(768), float32], %x1: Tensor[(1, 1, 1, 128), float32],....
  %0 = reshape(%x, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %1 = transpose(%tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/query/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
  %2 = nn.dense(%0, %1, units=768) /* ty=Tensor[(128, 768), float32] */;
  %3 = reshape(%2, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %4 = add(%3, %tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/query/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %5 = reshape(%4, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12, 64), float32] */;
  %6 = transpose(%5, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %7 = reshape(%x, newshape=[128, 768]) /* ty=Tensor[(128, 768), float32] */;
  %8 = transpose(%tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/key/Tensordot/ReadVariableOp/resource, axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
  %9 = nn.dense(%7, %8, units=768) /* ty=Tensor[(128, 768), float32] */;
  %10 = reshape(%9, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768), float32] */;
  %11 = add(%10, %tf_bert_for_sequence_classification/bert/encoder/layer_._4/attention/self/key/BiasAdd/ReadVariableOp/resource) /* ty=Tensor[(1, 128, 768), float32] */;
  %12 = reshape(%11, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12, 64), float32] */;
  %13 = transpose(%12, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64), float32] */;
  %14 = reshape(%6, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64), float32] */;
  %15 = reshape(%13, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64), float32] */;
  %16 = nn.batch_matmul(%14, %15, transpose_b=True) /* ty=Tensor[(12, 128, 128), float32] */;
  %17 = reshape(%16, newshape=[1, 12, 128, 128]) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  %18 = divide(%17, 8f /* ty=float32 */) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  %19 = add(%18, %x1) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  %20 = nn.softmax(%19) /* ty=Tensor[(1, 12, 128, 128), float32] */;
  .....

When I check the data flow between the two subgraphs, I notice two data dependencies:

  1. The last operation of the first subgraph → %x: Tensor[(1, 128, 768), float32] of the second subgraph. For this one, I can follow the reference code shown below to feed the output of the first subgraph into the input of the second subgraph.

  2. %42 of the first subgraph → %x1: Tensor[(1, 1, 1, 128), float32] of the second subgraph. This value is effectively a constant that feeds into every layer (e.g., %19 in the second subgraph). However, I cannot forward this dependency to the next subgraph, since %42 is not registered as a global output of the first subgraph.

Thus, I am still wondering: is it possible for a user to register arbitrary operations in Relay IR as new outputs, so they can be read out (or, in my case, sent to another subgraph)?

Thanks for your help in advance.

cc @hjiang

============================================================

Reference code:

Kindly asking: does anyone have any thoughts on this?