Op fusion issue with layout_transform followed by bitpack

Hi Everyone,

I am trying to build the graph below:

def @main(%data: Tensor[(32, 64, 224, 224), float32]) -> Tensor[(32, 224, 224, 2, 8), uint8] {
  %0 = cast(%data, dtype="int16") /* ty=Tensor[(32, 64, 224, 224), int16] */;
  %1 = layout_transform(%0, src_layout="NCHW", dst_layout="NHWC") /* ty=Tensor[(32, 224, 224, 64), int16] */;
  nn.bitpack(%1, bits=2, pack_axis=3, bit_axis=3, pack_type="uint8", name="bitpack") /* ty=Tensor[(32, 224, 224, 2, 8), uint8] */
}

with target = tvm.target.Target("llvm -device=arm_cpu -mtriple=aarch64-linux-gnu")
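
For context, this is roughly how I construct and compile the module in Python (a minimal sketch reconstructed from the Relay text above, not my exact script):

import tvm
from tvm import relay

# Build the same graph as the Relay text above (same shapes and parameters).
data = relay.var("data", shape=(32, 64, 224, 224), dtype="float32")
x = relay.cast(data, dtype="int16")
x = relay.layout_transform(x, src_layout="NCHW", dst_layout="NHWC")
x = relay.nn.bitpack(x, bits=2, pack_axis=3, bit_axis=3, pack_type="uint8", name="bitpack")
mod = tvm.IRModule.from_expr(relay.Function([data], x))

target = tvm.target.Target("llvm -device=arm_cpu -mtriple=aarch64-linux-gnu")
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target)  # the fusion error shows up here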

But it throws an error:

Check failed: compute->body.size() == 1U (2 vs. 1) : can only inline compute op with 1 output

Is there any error in the graph? Thanks.

When I use (layout_transform + bitpack) it throws the above error, but if I swap those layers to (bitpack + layout_transform), fusion works fine.

To make the first ordering work, I had to either comment out this line https://github.com/apache/tvm/blob/bf65b396c15b3cbec18fb1aecfa6862f58a2f307/src/relay/transforms/fuse_ops.cc#L789 (so that injective ops are not fused) or insert a stop_fusion node between the two ops (see the sketch below). I don't understand why this is necessary. Any help would be appreciated. Thanks.
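
For clarity, the stop_fusion workaround I mean looks roughly like this (a sketch using relay.annotation.stop_fusion between the two ops; the rest of the graph is the same as above):

x = relay.cast(data, dtype="int16")
x = relay.layout_transform(x, src_layout="NCHW", dst_layout="NHWC")
# Annotate the layout_transform output so the fuser does not merge it with bitpack.
x = relay.annotation.stop_fusion(x)
x = relay.nn.bitpack(x, bits=2, pack_axis=3, bit_axis=3, pack_type="uint8", name="bitpack")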