How to efficiently copy an IR

Hi all, I'm wondering how to efficiently copy an IR (an `IRModule`) in TVM.

What I tried:

  1. `copy.deepcopy`: does not work; it raises an error for some IRs.
  2. `astext` + `fromtext`: works, but is rather slow (about 2 s for big models).

Thanks!

In terms of deep copy, the most efficient way is SaveJSON and LoadJSON >_<

@junrushao Thank you Junru! It did become faster!

Another related question: will passes alter the original module? That is, will `mod` be modified after applying `new_mod = pass(mod)`? It seems this is the case, so I have to do the copy manually if I want to keep the old module around (I had assumed it would be copy-on-write or something similar).

# we have a variable `mod` which is an IRModule
old_mod_str = mod.astext()
_ = ToBasicBlockNormalForm()(mod)
assert old_mod_str == mod.astext()  # will fail

To reproduce:

from tvm import relay
import tvm
from tvm.relay import testing, transform
import hashlib

def example(batch_dim=1):
    out_channels = 32

    data = relay.var("data", relay.TensorType((batch_dim, 3, 224, 224), "float32"))
    weight = relay.var("weight")

    simple_net = relay.nn.conv2d(
        data=data, weight=weight, kernel_size=(5, 5), channels=out_channels, padding=(1, 1)
    )
    simple_net = relay.Function(relay.analysis.free_vars(simple_net), simple_net)

    return testing.create_workload(simple_net)


def md5(data):
    return hashlib.md5(data).hexdigest()

mod, params = example()

print(f'before apply pass, md5 of mod: {md5(mod.astext().encode())}')
print(mod.astext())
with tvm.transform.PassContext(opt_level=4):
    with tvm.target.Target('llvm'):
        seq = tvm.transform.Sequential(
            passes=[transform.ToBasicBlockNormalForm()],
            opt_level=4
        )
        new_mod = seq(mod)

print(f'after apply pass, md5 of mod: {md5(mod.astext().encode())}')
print(f'after apply pass, md5 of new mod: {md5(new_mod.astext().encode())}')

print(mod.astext())
before apply pass, md5 of mod: 383f47fec6c1b1ad607b5e66671602f0
#[version = "0.0.5"]
def @main(%data: Tensor[(1, 3, 224, 224), float32], %weight: Tensor[(32, 3, 5, 5), float32]) -> Tensor[(1, 32, 222, 222), float32] {
  nn.conv2d(%data, %weight, padding=[1, 1, 1, 1], channels=32, kernel_size=[5, 5]) /* ty=Tensor[(1, 32, 222, 222), float32] */
}

after apply pass, md5 of mod: 207a065e002a2e9dcc3873ff51059394
after apply pass, md5 of new mod: 207a065e002a2e9dcc3873ff51059394
#[version = "0.0.5"]
def @main(%data: Tensor[(1, 3, 224, 224), float32], %weight: Tensor[(32, 3, 5, 5), float32]) -> Tensor[(1, 32, 222, 222), float32] {
  nn.conv2d(%data, %weight, padding=[1, 1, 1, 1], channels=32, kernel_size=[5, 5])
}

This seems to occur specifically with the basic-block-normal-form pass; other passes, e.g. FuseOps, do not modify the input module. So I am wondering whether it is expected for some passes to modify their input argument (the old module).

IIRC we made an immutability assumption here: passes should not modify the original IRModule. We did previously find some bugs in the codebase where the module was incorrectly modified, though :frowning: