TVM inference BERT results are inconsistent every time

Hello @masahi, as the title says, I took the model from Zenodo and used the following code to export a TorchScript model:

import torch
from transformers import BertConfig, BertForQuestionAnswering

dev = torch.device("cpu")  # or "cuda"
# config is the BertConfig posted further down in this thread
model = BertForQuestionAnswering(config)
model.to(dev)
model.eval()  # disable dropout before tracing
model_file = "/model/bert/model.pytorch"
model.load_state_dict(torch.load(model_file), strict=False)
# dummy inputs for tracing: input_ids, attention_mask, token_type_ids (batch 64, seq len 384)
inputs = [
    torch.ones((64, 384), dtype=torch.int64),
    torch.ones((64, 384), dtype=torch.int64),
    torch.ones((64, 384), dtype=torch.int64),
]
scripted_model = torch.jit.trace(model, inputs).eval()
torch.jit.save(scripted_model, "bert.jit")
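
By the way, a quick determinism check on the freshly traced module (just a sketch, reusing model and inputs from above, and assuming the traced module returns a (start, end) tuple as in the snippet further down) is to run it twice on the same inputs:

with torch.no_grad():
    out_a = scripted_model(*inputs)
    out_b = scripted_model(*inputs)

# If model.eval() really disabled dropout, two passes must be bit-identical.
assert all(torch.equal(a, b) for a, b in zip(out_a, out_b))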

When I load bert.jit with torch.jit.load and import it into TVM with from_pytorch, the final inference result is different on every run, and it also differs from the direct PyTorch inference result. What could be the reason for this? Thank you!

import numpy as np
import torch
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# PyTorch reference inference with the saved TorchScript model
jit_model = torch.jit.load(model_path, map_location="cpu")

with torch.no_grad():
    start_pos, end_pos = jit_model(tokens_tensor, segments_tensors, mask_tensors)
 

input_shapes = [
    ("input_ids", ([64, 384], "int64")),
    ("attention_mask", ([64, 384], "int64")),
    ("token_type_ids", ([64, 384], "int64")),
]


mod, params = relay.frontend.from_pytorch(jit_model, input_shapes)

# Compile the Relay module and create a graph executor on the target device
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
dev = tvm.device(str(target), 0)
module = graph_executor.GraphModule(lib["default"](dev))

# Convert the host-side input lists to TVM NDArrays (dtype should be "int64" to match input_shapes)
input_ids = tvm.nd.array(np.array(input_ids_list).astype(dtype))
segment_ids = tvm.nd.array(np.array(segment_ids_list).astype(dtype))
input_mask = tvm.nd.array(np.array(input_mask_list).astype(dtype))

module.set_input(input_ids=input_ids, input_mask=input_mask, segment_ids=segment_ids)
module.run()
tvm_start_pos = module.get_output(0).numpy()
tvm_end_pos = module.get_output(1).numpy()
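
To quantify the mismatch, the TVM outputs can be compared against the PyTorch results from the jit_model call above (a sketch; it assumes both runs were fed the same batch, and the tolerances are just a guess):

np.testing.assert_allclose(tvm_start_pos, start_pos.numpy(), rtol=1e-4, atol=1e-4)
np.testing.assert_allclose(tvm_end_pos, end_pos.numpy(), rtol=1e-4, atol=1e-4)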

I’m downloading the model now (it’s so slow…). Can you verify that the PyTorch model doesn’t have a random op like dropout?
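
One way to check this (just a sketch; inlined_graph is available on reasonably recent PyTorch versions) is to scan the loaded TorchScript graph for dropout nodes:

import torch

loaded = torch.jit.load("bert.jit", map_location="cpu")

# Dropout traced in eval() mode should be absent or carry train=False;
# an aten::dropout node with train=True would explain nondeterminism.
for node in loaded.inlined_graph.nodes():
    if "dropout" in node.kind():
        print(node)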

I’m not sure; I got the model directly from there. But there is a strange problem: if I don’t save it with torch.jit.save and instead pass the traced model from the script above directly to from_pytorch, the TVM and PyTorch inference results are the same. When I save and reload the bert.jit model, TVM and PyTorch inference differ, and the TVM results change on every run. Is it because TVM needs to turn off some random initialization? I haven’t tried that before. Thank you!
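
In other words, the two paths I compared look roughly like this (reusing names from the snippets above):

# Path 1: hand the freshly traced module straight to the TVM frontend
mod1, params1 = relay.frontend.from_pytorch(scripted_model, input_shapes)

# Path 2: round-trip through torch.jit.save / torch.jit.load first
torch.jit.save(scripted_model, "bert.jit")
mod2, params2 = relay.frontend.from_pytorch(torch.jit.load("bert.jit"), input_shapes)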

Please make sure that the inference result of PyTorch alone, after save / load, is consistent across different invocations. If it is a PyTorch issue, we cannot do anything about it.
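
Something like the following (a sketch, reusing the tensors from your snippet) would confirm that:

import torch

loaded = torch.jit.load("bert.jit", map_location="cpu")

with torch.no_grad():
    out1 = loaded(tokens_tensor, segments_tensors, mask_tensors)
    out2 = loaded(tokens_tensor, segments_tensors, mask_tensors)

# Bit-identical outputs across two invocations rule out random ops on the PyTorch side.
assert all(torch.equal(a, b) for a, b in zip(out1, out2))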

Please show your config to initialize BertForQuestionAnswering.

The PyTorch inference results are the same in both cases: 1) bert.pt -> model.load_state_dict(torch.load(model_file), strict=False) -> torch.jit.trace(model, inputs).eval(); 2) bert.jit -> torch.jit.load. But the two TVM inferences are different: the first case matches PyTorch, while the second one gives different results every time it runs. I hope that explains it, thanks a lot :smiley:

config = BertConfig(
    attention_probs_dropout_prob=0.1,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    hidden_size=1024,
    initializer_range=0.02,
    intermediate_size=4096,
    max_position_embeddings=512,
    num_attention_heads=16,
    num_hidden_layers=24,
    type_vocab_size=2,
    vocab_size=30522,
)

Please post a complete runnable script. Your code snippet is full of undefined variables.

Hello @masahi, I have located the problem. Due to carelessness, the three input names in input_shapes in the code above are different from the names I passed to module.set_input. But I would have thought that if the names don't match, an error should be reported, yet a result still comes out! Thank you for your answers above, and sorry for the inconvenience.
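
For future readers: the graph executor's set_input treats keyword arguments like parameter bindings and quietly skips any name it cannot find (so weights already bound into the module don't raise), which means mismatched names leave those inputs unset and the model runs on whatever happens to be in those buffers. That would explain both the wrong and the run-to-run varying results. A quick way to catch this (a sketch, using the Relay module built above) is to list the input names the frontend actually produced and set them explicitly:

# Names the Relay frontend produced; the first entries are the model inputs.
print([v.name_hint for v in mod["main"].params])

# Passing the name positionally makes a typo fail loudly instead of silently.
module.set_input("input_ids", input_ids)
module.set_input("attention_mask", input_mask)
module.set_input("token_type_ids", segment_ids)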