TVM inference BERT results are inconsistent every time

Hello @masahi, as the title says, I took the model from Zenodo and used the following code to export a TorchScript model:

import torch
from transformers import BertConfig, BertForQuestionAnswering

dev = torch.device("cpu")  # or "cuda"
# config is the BertConfig posted further down in this thread
model = BertForQuestionAnswering(config)
model.to(dev)
model.eval()  # disable dropout before tracing
model_file = "/model/bert/model.pytorch"
model.load_state_dict(torch.load(model_file), strict=False)
# dummy inputs for tracing: input_ids, attention_mask, token_type_ids (batch 64, seq len 384)
inputs = [
    torch.ones((64, 384), dtype=torch.int64),
    torch.ones((64, 384), dtype=torch.int64),
    torch.ones((64, 384), dtype=torch.int64),
]
scripted_model = torch.jit.trace(model, inputs).eval()
torch.jit.save(scripted_model, "bert.jit")
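
By the way, a quick determinism check on the freshly traced module (just a sketch, reusing model and inputs from above, and assuming the traced module returns a (start, end) tuple as in the snippet further down) is to run it twice on the same inputs:

with torch.no_grad():
    out_a = scripted_model(*inputs)
    out_b = scripted_model(*inputs)

# If model.eval() really disabled dropout, two passes must be bit-identical.
assert all(torch.equal(a, b) for a, b in zip(out_a, out_b))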

When I load bert.jit with torch.jit.load and import it into TVM with from_pytorch, the final inference result is different on every run, and it also differs from the direct PyTorch inference result. What could be the reason for this? Thank you!

import numpy as np
import torch
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# PyTorch reference inference with the saved TorchScript model
jit_model = torch.jit.load(model_path, map_location="cpu")

with torch.no_grad():
    start_pos, end_pos = jit_model(tokens_tensor, segments_tensors, mask_tensors)
 

input_shapes = [
    ("input_ids", ([64, 384], "int64")),
    ("attention_mask", ([64, 384], "int64")),
    ("token_type_ids", ([64, 384], "int64")),
]


mod, params = relay.frontend.from_pytorch(jit_model, input_shapes)

# Compile the Relay module and create a graph executor on the target device
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
dev = tvm.device(str(target), 0)
module = graph_executor.GraphModule(lib["default"](dev))

# Convert the host-side input lists to TVM NDArrays (dtype should be "int64" to match input_shapes)
input_ids = tvm.nd.array(np.array(input_ids_list).astype(dtype))
segment_ids = tvm.nd.array(np.array(segment_ids_list).astype(dtype))
input_mask = tvm.nd.array(np.array(input_mask_list).astype(dtype))

module.set_input(input_ids=input_ids, input_mask=input_mask, segment_ids=segment_ids)
module.run()
tvm_start_pos = module.get_output(0).numpy()
tvm_end_pos = module.get_output(1).numpy()
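
To quantify the mismatch, the TVM outputs can be compared against the PyTorch results from the jit_model call above (a sketch; it assumes both runs were fed the same batch, and the tolerances are just a guess):

np.testing.assert_allclose(tvm_start_pos, start_pos.numpy(), rtol=1e-4, atol=1e-4)
np.testing.assert_allclose(tvm_end_pos, end_pos.numpy(), rtol=1e-4, atol=1e-4)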

I’m downloading the model now (it’s so slow…). Can you verify that the PyTorch model doesn’t have a random op like dropout?
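
One way to check this (just a sketch; inlined_graph is available on reasonably recent PyTorch versions) is to scan the loaded TorchScript graph for dropout nodes:

import torch

loaded = torch.jit.load("bert.jit", map_location="cpu")

# Dropout traced in eval() mode should be absent or carry train=False;
# an aten::dropout node with train=True would explain nondeterminism.
for node in loaded.inlined_graph.nodes():
    if "dropout" in node.kind():
        print(node)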

I’m not sure; I got the model directly from there. But there is a strange problem: if I don’t save it with torch.jit.save and instead pass the traced model from the script above directly to from_pytorch, the TVM and PyTorch inference results are the same. When I save and reload the bert.jit model, TVM and PyTorch inference differ, and the TVM results change on every run. Is it because TVM needs to turn off some random initialization? I haven’t tried that before. Thank you!
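
In other words, the two paths I compared look roughly like this (reusing names from the snippets above):

# Path 1: hand the freshly traced module straight to the TVM frontend
mod1, params1 = relay.frontend.from_pytorch(scripted_model, input_shapes)

# Path 2: round-trip through torch.jit.save / torch.jit.load first
torch.jit.save(scripted_model, "bert.jit")
mod2, params2 = relay.frontend.from_pytorch(torch.jit.load("bert.jit"), input_shapes)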

Please make sure that the inference result of PyTorch alone, after save / load, is consistent across different invocations. If it is a PyTorch issue, we cannot do anything about it.
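
Something like the following (a sketch, reusing the tensors from your snippet) would confirm that:

import torch

loaded = torch.jit.load("bert.jit", map_location="cpu")

with torch.no_grad():
    out1 = loaded(tokens_tensor, segments_tensors, mask_tensors)
    out2 = loaded(tokens_tensor, segments_tensors, mask_tensors)

# Bit-identical outputs across two invocations rule out random ops on the PyTorch side.
assert all(torch.equal(a, b) for a, b in zip(out1, out2))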

Please show your config to initialize BertForQuestionAnswering.

The PyTorch inference results are the same in both cases: 1) bert.pt -> model.load_state_dict(torch.load(model_file), strict=False) -> torch.jit.trace(model, inputs).eval(); 2) bert.jit -> torch.jit.load. But the two TVM inferences are different: the first case matches PyTorch, while the second one gives different results every time it runs. I hope that explains it, thanks a lot :smiley:

config = BertConfig(
    attention_probs_dropout_prob=0.1,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    hidden_size=1024,
    initializer_range=0.02,
    intermediate_size=4096,
    max_position_embeddings=512,
    num_attention_heads=16,
    num_hidden_layers=24,
    type_vocab_size=2,
    vocab_size=30522,
)

Please post a complete runnable script. Your code snippet is full of undefined variables.

Hello @masahi, I have located the problem. Due to carelessness, the three input names in input_shapes in the code above are different from the names I passed to module.set_input. But I would have thought that if the names don't match, an error should be reported, yet a result still comes out! Thank you for your answers above, and sorry for the inconvenience.
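
For future readers: the graph executor's set_input treats keyword arguments like parameter bindings and quietly skips any name it cannot find (so weights already bound into the module don't raise), which means mismatched names leave those inputs unset and the model runs on whatever happens to be in those buffers. That would explain both the wrong and the run-to-run varying results. A quick way to catch this (a sketch, using the Relay module built above) is to list the input names the frontend actually produced and set them explicitly:

# Names the Relay frontend produced; the first entries are the model inputs.
print([v.name_hint for v in mod["main"].params])

# Passing the name positionally makes a typo fail loudly instead of silently.
module.set_input("input_ids", input_ids)
module.set_input("attention_mask", input_mask)
module.set_input("token_type_ids", segment_ids)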