[BYOC] multi-layer subgraphs

max1996 · September 29, 2020, 7:39am

def @main(%input_1: Tensor[(1, 224, 224, 3), float32]) -> Tensor[(1, 1000), float32] {
  %69 = nn.pad(%input_1, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]) /* ty=Tensor[(1, 225, 225, 3), float32] */;
  %70 = multiply(%69, 16f /* ty=float32 */) /* ty=Tensor[(1, 225, 225, 3), float32] */;
  %71 = round(%70) /* ty=Tensor[(1, 225, 225, 3), float32] */;
  %72 = clip(%71, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 225, 225, 3), float32] */;
  %73 = cast(%72, dtype="int8") /* ty=Tensor[(1, 225, 225, 3), int8] */;
  %74 = @tinyai_0(%73) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %75 = add(%74, 64 /* ty=int32 */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %76 = right_shift(%75, 7 /* ty=int32 */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %77 = clip(%76, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %78 = multiply(%77, meta[relay.Constant][9] /* ty=Tensor[(32), int32] */ /* ty=Tensor[(32), int32] */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %79 = add(%78, meta[relay.Constant][10] /* ty=Tensor[(32), int32] */ /* ty=Tensor[(32), int32] */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %80 = clip(%79, a_min=0f, a_max=192f) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %81 = @tinyai_2(%80) /* ty=Tensor[(1, 112, 112, 32), int8] */;
  %82 = annotation.stop_fusion(%81) /* ty=Tensor[(1, 112, 112, 32), int8] */;
  %83 = @tinyai_3(%82) /* ty=Tensor[(1, 112, 112, 32), int32] */;
  %84 = add(%83, 2 /* ty=int32 */) /* ty=Tensor[(1, 112, 112, 32), int32] */;

For my BYOC backend, I am using TVM's quantization, followed by composite pattern matching, annotation, region merging, and partitioning. The Relay description above, which represents part of the partitioned MobileNetV1 (converted from a Keras model), contains a couple of annotation.stop_fusion statements that separate subgraphs of my custom backend (see lines %81 to %83, where %82 sits between @tinyai_2 and @tinyai_3).
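For readers who want to reproduce the flow, here is a minimal sketch of the pass sequence described above. It is not my actual code: the function name partition_for_tinyai, the composite name "tinyai.conv2d", the single conv2d pattern, and the default qconfig settings are all assumptions for illustration, and the "tinyai" codegen itself has to be registered separately.

```python
def partition_for_tinyai(mod, params):
    """Quantize a Relay module, then partition it for a "tinyai" BYOC target.

    Hypothetical helper: names and settings are illustrative assumptions.
    """
    # Imported inside the function so this file still loads where TVM is absent.
    import tvm
    from tvm import relay
    from tvm.relay.dataflow_pattern import is_op, wildcard

    # Assumed composite pattern: a bare conv2d. A real backend would likely
    # match larger patterns (conv2d plus bias, requantization, and so on).
    conv2d = is_op("nn.conv2d")(wildcard(), wildcard())
    pattern_table = [("tinyai.conv2d", conv2d)]

    # Step 1: TVM's built-in quantization, here with default settings.
    with relay.quantize.qconfig():
        mod = relay.quantize.quantize(mod, params)

    # Steps 2 to 5: composite matching, annotation, region merging, partitioning.
    seq = tvm.transform.Sequential(
        [
            relay.transform.MergeComposite(pattern_table),
            relay.transform.AnnotateTarget("tinyai"),
            relay.transform.MergeCompilerRegions(),
            relay.transform.PartitionGraph(),
        ]
    )
    with tvm.transform.PassContext(opt_level=3):
        return seq(mod)
```

Printing the returned module after each individual pass, instead of running them as one Sequential, is one way to see at which step the annotation.stop_fusion calls first appear.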

It seems that these annotation.stop_fusion calls are introduced in the annotation step, and they usually separate the individual layers of the model.
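To check which subgraphs each stop_fusion actually separates, the printed Relay text can be scanned with a few lines of plain Python. This is only a rough sketch that works on the text dump (not on the Relay AST); the helper name find_stop_fusion_boundaries and the shortened excerpt are made up for illustration.

```python
import re

# Shortened excerpt of the Relay text above (type annotations dropped).
RELAY_TEXT = """
%80 = clip(%79, a_min=0f, a_max=192f);
%81 = @tinyai_2(%80);
%82 = annotation.stop_fusion(%81);
%83 = @tinyai_3(%82);
%84 = add(%83, 2);
"""

# Matches one binding such as `%82 = annotation.stop_fusion(%81) ...`.
BINDING = re.compile(r"^\s*(%\d+)\s*=\s*(@?[\w.]+)\((.*)$")


def find_stop_fusion_boundaries(relay_text):
    """Return a (producer, consumers) pair for every stop_fusion binding."""
    callee = {}  # variable -> op or function that produced it
    args = {}    # variable -> numbered variables used as its arguments
    for line in relay_text.splitlines():
        match = BINDING.match(line)
        if not match:
            continue
        var, name, rest = match.groups()
        callee[var] = name
        args[var] = re.findall(r"%\d+", rest)

    boundaries = []
    for var, name in callee.items():
        if name != "annotation.stop_fusion" or not args[var]:
            continue
        # The op feeding the stop_fusion, and every op reading its result.
        producer = callee.get(args[var][0], "<unknown>")
        consumers = [callee[v] for v, used in args.items() if var in used]
        boundaries.append((producer, consumers))
    return boundaries


for producer, consumers in find_stop_fusion_boundaries(RELAY_TEXT):
    print(producer, "-> stop_fusion ->", ", ".join(consumers))
```

Running it over the full dump lists every pair of calls that has a stop_fusion between them, which makes it easy to see whether the boundaries really line up with layer boundaries.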