Pipelining the graph with tvm.contrib.pipeline_executor vs. using relay.transform.PartitionGraph

Hi all,

I want to split the computational graph into subgraphs and then apply optimizations to them. So far, I’ve seen two methods:

  1. Pipelining: from GitHub

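         import tvm
         from tvm.relay import transform

         # The wrapper name is illustrative; the snippet's `return` implies an
         # enclosing function that receives `mod`, `pattern_table`, and `bind_constants`.
         def partition_for_dnnl(mod, pattern_table, bind_constants=True):
             # Simplify inference-only ops (e.g. batch norm) and fold constants and scales.
             remove_bn_pass = tvm.transform.Sequential(
                 [
                     transform.InferType(),
                     transform.SimplifyInference(),
                     transform.FoldConstant(),
                     transform.FoldScaleAxis(),
                 ]
             )
             # Merge matched patterns into composite functions, annotate them for
             # the "dnnl" target, then split the annotated regions into functions.
             composite_partition = tvm.transform.Sequential(
                 [
                     remove_bn_pass,
                     transform.MergeComposite(pattern_table),
                     transform.AnnotateTarget("dnnl"),
                     transform.PartitionGraph(bind_constants=bind_constants),
                 ]
             )

             with tvm.transform.PassContext(opt_level=3, disabled_pass=["AlterOpLayout"]):
                 return composite_partition(mod)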
    
  2. relay.transform.PartitionGraph: from GitHub

I’m not sure what the difference is between relay.transform.PartitionGraph (which, as I understand it, partitions the graph into regions for execution on different backends) and pipelining. In pipelining, are the partitions always interdependent?
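
To make clear what I mean by "partitioning into regions", here is a toy sketch in plain Python. It does not use any TVM API: the op list, the supported-op set, and the function name `partition_by_target` are all made up for illustration, and real partitioning works on a graph rather than a flat chain of ops.

```python
from itertools import groupby

# Hypothetical linear chain of ops and a hypothetical set of ops that an
# external backend ("dnnl" here) supports. Neither comes from TVM.
OPS = ["conv2d", "relu", "softmax", "conv2d", "relu"]
DNNL_SUPPORTED = {"conv2d", "relu"}


def partition_by_target(ops, supported, external="dnnl", fallback="llvm"):
    """Group consecutive ops that share a target into one region."""

    def target_of(op):
        # Ops the external backend supports go there; the rest fall back.
        return external if op in supported else fallback

    # groupby only merges adjacent items with the same key, so each
    # resulting group is a contiguous region of the chain.
    return [(target, list(group)) for target, group in groupby(ops, key=target_of)]


regions = partition_by_target(OPS, DNNL_SUPPORTED)
for target, region_ops in regions:
    print(target, region_ops)
```

In this toy chain each region consumes the output of the one before it, which is the kind of dependency between partitions I am asking about.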

I am learning TVM and would appreciate it if anyone could help me find the answer to my question.