def @main(%input_1: Tensor[(1, 224, 224, 3), float32]) -> Tensor[(1, 1000), float32] {
%69 = nn.pad(%input_1, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]) /* ty=Tensor[(1, 225, 225, 3), float32] */;
%70 = multiply(%69, 16f /* ty=float32 */) /* ty=Tensor[(1, 225, 225, 3), float32] */;
%71 = round(%70) /* ty=Tensor[(1, 225, 225, 3), float32] */;
%72 = clip(%71, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 225, 225, 3), float32] */;
%73 = cast(%72, dtype="int8") /* ty=Tensor[(1, 225, 225, 3), int8] */;
%74 = @tinyai_0(%73) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%75 = add(%74, 64 /* ty=int32 */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%76 = right_shift(%75, 7 /* ty=int32 */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%77 = clip(%76, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%78 = multiply(%77, meta[relay.Constant][9] /* ty=Tensor[(32), int32] */ /* ty=Tensor[(32), int32] */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%79 = add(%78, meta[relay.Constant][10] /* ty=Tensor[(32), int32] */ /* ty=Tensor[(32), int32] */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%80 = clip(%79, a_min=0f, a_max=192f) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%81 = @tinyai_2(%80) /* ty=Tensor[(1, 112, 112, 32), int8] */;
%82 = annotation.stop_fusion(%81) /* ty=Tensor[(1, 112, 112, 32), int8] */;
%83 = @tinyai_3(%82) /* ty=Tensor[(1, 112, 112, 32), int32] */;
%84 = add(%83, 2 /* ty=int32 */) /* ty=Tensor[(1, 112, 112, 32), int32] */;
For my BYOC backend, I am using TVM's quantization, followed by composite pattern matching, annotation, merging, and partitioning. The Relay listing above shows part of the partitioned MobileNetV1 (converted from a Keras model). The partitioned graph contains a couple of annotation.stop_fusion calls that separate subgraphs of my custom backend (as in %81 to %83).
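To list every place where this happens, the Relay text can be scanned for calls whose input comes straight from annotation.stop_fusion. The following is only a minimal sketch that works on the printed text, not on the Relay AST. The helper name find_stop_fusion_boundaries is hypothetical, and it assumes one binding per line with the variable of interest as the first argument, as in the dump above.

```python
import re

# Matches a Relay text binding such as "%82 = annotation.stop_fusion(%81) ...".
# Groups: defined variable, called op or function, first variable argument (optional).
BINDING = re.compile(r"^\s*(%\w+)\s*=\s*(@?[\w.]+)\((%\w+)?")


def find_stop_fusion_boundaries(relay_text):
    """Return (producer, consumer) pairs that are separated by stop_fusion.

    Hypothetical helper: it only inspects the first argument of each call,
    which is enough for simple chains like the one in the dump above.
    """
    producer_of = {}   # variable -> op/function that defines it
    first_arg_of = {}  # variable -> first variable argument of that call
    for line in relay_text.splitlines():
        match = BINDING.match(line)
        if match:
            var, op, arg = match.groups()
            producer_of[var] = op
            first_arg_of[var] = arg

    boundaries = []
    for var, op in producer_of.items():
        arg = first_arg_of[var]
        # Is this call fed directly by a stop_fusion result?
        if arg is not None and producer_of.get(arg) == "annotation.stop_fusion":
            upstream = first_arg_of[arg]
            boundaries.append((producer_of.get(upstream), op))
    return boundaries


sample = """
%81 = @tinyai_2(%80) /* ty=Tensor[(1, 112, 112, 32), int8] */;
%82 = annotation.stop_fusion(%81) /* ty=Tensor[(1, 112, 112, 32), int8] */;
%83 = @tinyai_3(%82) /* ty=Tensor[(1, 112, 112, 32), int32] */;
"""
print(find_stop_fusion_boundaries(sample))
```

For the three lines copied from the listing, this reports a single boundary between @tinyai_2 and @tinyai_3.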
The annotation.stop_fusion calls seem to be introduced in the annotation step, and they usually separate the individual layers of the model.
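One way to check which step actually introduces them is to apply the pipeline one stage at a time and look for the marker in the printed module after each stage. The sketch below is deliberately independent of TVM so that it runs on its own. The function name first_stage_introducing and the (name, callable) stage format are assumptions made for illustration.

```python
def first_stage_introducing(marker, module, stages, to_text=str):
    """Apply stages in order and report where `marker` first shows up.

    Returns "input" if the marker is already present before any stage runs,
    the name of the first stage whose output contains it, or None.
    `stages` is a list of (name, callable) pairs; each callable takes a
    module and returns the transformed module.
    """
    if marker in to_text(module):
        return "input"
    for name, apply_stage in stages:
        module = apply_stage(module)
        if marker in to_text(module):
            return name
    return None


# Stand-in stages operating on plain strings, just to show the control flow.
demo_stages = [
    ("stage_a", lambda text: text + " conv2d"),
    ("stage_b", lambda text: text + " annotation.stop_fusion"),
    ("stage_c", lambda text: text),
]
print(first_stage_introducing("annotation.stop_fusion", "main", demo_stages))
```

With a real module, the stages would be the steps named above, for example relay.quantize.quantize wrapped in a lambda, followed by relay.transform.MergeComposite, AnnotateTarget, MergeCompilerRegions, and PartitionGraph, with the pattern table and target name of the custom backend filled in. This assumes that printing the module with str yields the Relay text.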