Model file is extremely large after tuning with TVM Auto-Scheduler

I tuned an original PyTorch trace model that is 120 MB. The TVM model file produced after tuning with auto_scheduler.TaskScheduler is 3.2 GB and is only 2 ms faster. Is this common?

My model is fp32 and the device is an Nvidia 2070. Thank you for any ideas.

This is a little confusing. TVM in general shouldn’t be changing the size of models, and the tuning process in particular shouldn’t have an impact, as it just produces a log file that is used during compilation. Do you have any Python snippets that show how you’re producing this much larger model?

Sure. My model is Swin Transformer, built from the open-source Swin Transformer model code. Here is the Auto-Scheduler tuning code.

import torch
from swin_transformer import SwinTransformer
import tvm
from tvm import relay, auto_scheduler

with torch.no_grad():
    model = SwinTransformer(img_size=224, in_chans=3, embed_dim=96, depths=[2, 3, 6, 2],
                            num_heads=[3, 6, 12, 24], window_size=7, drop_path_rate=0.2,
                            num_classes=13).float().cuda().eval()
    shape = [192, 3, 224, 224]
    input0 = torch.ones(shape).float().cuda()
    trace = torch.jit.trace(model, input0)
    torch.jit.save(trace, 'st_v1.trace')
    relay_model, params = relay.frontend.from_pytorch(trace, [('input0', input0.shape)], default_dtype='float32')
    target = tvm.target.cuda()
    tasks, task_weights = auto_scheduler.extract_tasks(relay_model["main"], params, target)
    measure_ctx = auto_scheduler.LocalRPCMeasureContext(repeat=1,
                                                        min_repeat_ms=100,
                                                        timeout=100)

    tuner = auto_scheduler.TaskScheduler(tasks,
                                         task_weights,
                                         load_model_file='st_v1')
    tune_option = auto_scheduler.TuningOptions(
        num_measure_trials=36000,
        num_measures_per_round=64,
        early_stopping=500,
        verbose=True,
        runner=measure_ctx.runner,
        measure_callbacks=[auto_scheduler.RecordToFile('st_v1.log')],
    )
    tuner.tune(tune_option)
    with auto_scheduler.ApplyHistoryBest("st_v1.log"):
        with tvm.transform.PassContext(opt_level=3, config={"relay.backend.use_auto_scheduler": True}):
            lib = relay.build(relay_model, target=target, params=params)

    lib.export_library('st_v1.so')

Well, I found that the model file built from Relay is still 3.2 GB even without the tuning process, in contrast with the 110 MB torch trace model. I guess there is a problem during task extraction.
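One way to narrow this down might be to compare the total size of the parameters going into relay.build with the parameters held by the built module, and to list the largest tensors. Below is a minimal sketch of that bookkeeping in plain Python. The helper names tensor_bytes and summarize are hypothetical, the dtype table only covers common types, and the commented TVM usage assumes that params values and lib.get_params() values expose .shape and .dtype; it was not run against the script above.

```python
import math

# Bytes per element for common dtype names (assumption: only these occur).
ITEMSIZE = {
    "float64": 8, "float32": 4, "float16": 2,
    "int64": 8, "int32": 4, "int8": 1, "uint8": 1, "bool": 1,
}

def tensor_bytes(shape, dtype):
    """Size in bytes of a dense tensor with the given shape and dtype name."""
    return math.prod(shape) * ITEMSIZE[dtype]

def summarize(tensors, top=5):
    """tensors maps name -> (shape, dtype).

    Returns (total_bytes, largest), where largest is a list of
    (name, bytes) pairs sorted from biggest to smallest.
    """
    sizes = {name: tensor_bytes(shape, dtype)
             for name, (shape, dtype) in tensors.items()}
    largest = sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:top]
    return sum(sizes.values()), largest

# Hypothetical usage with the objects from the tuning script (untested):
#   before = {k: (tuple(v.shape), str(v.dtype)) for k, v in params.items()}
#   after = {k: (tuple(v.shape), str(v.dtype))
#            for k, v in lib.get_params().items()}
#   print(summarize(before))
#   print(summarize(after))

if __name__ == "__main__":
    # Stand-in data with made-up names and shapes, just to show the output format.
    demo = {
        "patch_embed.weight": ((96, 3, 4, 4), "float32"),
        "some_large_constant": ((192, 64, 3, 49, 49), "float32"),
    }
    total, largest = summarize(demo, top=2)
    print(f"total: {total / 2**20:.1f} MiB")
    for name, nbytes in largest:
        print(f"  {name}: {nbytes / 2**20:.1f} MiB")
```

If the "after" total is much larger than the "before" total, the largest entries would show which tensors account for the growth, which would help check whether the size really comes from task extraction or from the build itself.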