Hi, I want to deploy the BERT-base model on an Android phone. One of its parameters has shape (30522, 768) and dtype float32, and the RPC connection is reset every time I try to allocate this array on the device:
for pk, pv in params.items():
    print(pv.shape, pv.dtype)
    weights[pk] = tvm.nd.array((np.random.uniform(size=pv.shape)).astype(pv.dtype), ctx=ctx)
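For a sense of scale, here is a quick estimate of how many bytes that single parameter needs, which is what the one `tvm.nd.array` call has to push over the RPC socket. This is only a sketch: the helper name `param_nbytes` and the small dtype table are made up for illustration and are not part of TVM.

```python
from math import prod

# Bytes per element for the dtypes that appear in this post
# (illustrative table, not exhaustive).
ITEMSIZE = {"float32": 4, "int64": 8}

def param_nbytes(shape, dtype):
    """Size in bytes of a dense array with the given shape and dtype."""
    return prod(shape) * ITEMSIZE[dtype]

# The parameter that triggers the connection reset.
size = param_nbytes((30522, 768), "float32")
print(size, "bytes =", round(size / 2**20, 2), "MiB")
```

So the failing copy is roughly 89 MiB sent in one transfer.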
The error message:
Traceback (most recent call last):
File "tune_network_x86.py", line 483, in <module>
tune_network()
File "tune_network_x86.py", line 423, in tune_network
weights[pk] = tvm.nd.array((np.random.uniform(size=pv.shape)).astype(pv.dtype), ctx=ctx)
File "/home/zyx/workspaces/python/tvm0.8_v2/python/tvm/runtime/ndarray.py", line 516, in array
return empty(arr.shape, arr.dtype, ctx).copyfrom(arr)
File "/home/zyx/workspaces/python/tvm0.8_v2/python/tvm/runtime/ndarray.py", line 154, in copyfrom
check_call(_LIB.TVMArrayCopyFromBytes(self.handle, data, nbytes))
File "/home/zyx/workspaces/python/tvm0.8_v2/python/tvm/_ffi/base.py", line 344, in check_call
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (6) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(TVMArrayCopyFromBytes+0xe) [0x7f097dcf53ae]
[bt] (5) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::ArrayCopyFromBytes(DLTensor*, void const*, unsigned long)+0x2c9) [0x7f097dcf52e9]
[bt] (4) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::RPCDeviceAPI::CopyDataFromTo(void const*, unsigned long, void*, unsigned long, unsigned long, DLContext, DLContext, DLDataType, void*)+0x346) [0x7f097dd265b6]
[bt] (3) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::RPCEndpoint::CopyToRemote(void*, unsigned long, void*, unsigned long, unsigned long, DLContext, DLDataType)+0x75d) [0x7f097dd2a4cd]
[bt] (2) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::RPCEndpoint::HandleUntilReturnEvent(bool, std::function<void (tvm::runtime::TVMArgs)>)+0x1a5) [0x7f097dd28955]
[bt] (1) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(tvm::runtime::SockChannel::Send(void const*, unsigned long)+0xb8) [0x7f097dd490b8]
[bt] (0) /home/zyx/workspaces/python/tvm0.8_v2/build/libtvm.so(+0x1bc2838) [0x7f097dd44838]
File "/home/zyx/workspaces/python/tvm0.8_v2/src/runtime/rpc/../../support/socket.h", line 360
TVMError: Socket SockChannel::Send Error:Connection reset by peer
The BERT model was imported from PyTorch:
model_class = transformers.BertModel
tokenizer_class = transformers.BertTokenizer
# Better to download them manually
# https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin
# https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
# https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json
# Then rename to pytorch_model.bin, vocab.txt & config.json
# weight = 'path to downloaded model dir'
weight = '/home/zyx/.torch/hub/bert-base-uncased'
model = model_class.from_pretrained(weight)
model = ModelWrapper(model)
model.eval()
# tokenizer = tokenizer_class.from_pretrained(weight)
# A = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
# There are 30522 words in bert-base-uncased's vocabulary list
input_shape = [batch_size, 128]
input_name = 'input_ids'
input_dtype = 'int64'
A = torch.randint(30000, input_shape)
scripted_model = torch.jit.trace(model, [A])
shape_list = [('input_ids', input_shape)]
mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)
mod = optimize_bert(mod, params)
The optimize_bert function applies the following passes:
new_mod = FastSoftmax(mod)
new_mod = ShapeConstDedup(new_mod)
new_mod = tvm.relay.transform.EliminateCommonSubexpr()(new_mod)
BindPass = tvm.relay.transform.function_pass(lambda fn, new_mod, ctx:
tvm.relay.build_module.bind_params_by_name(fn, params), opt_level=1)
new_mod = BindPass(new_mod)
new_mod = tvm.relay.transform.FoldConstant()(new_mod)
new_mod = tvm.relay.transform.CombineParallelBatchMatmul()(new_mod)
# new_mod = tvm.relay.transform._ffi_api.BatchMatmulWeightTranspose()(new_mod)
new_mod = tvm.relay.transform.FoldConstant()(new_mod)
ret_list.append(new_mod)
I also tried applying the ring_buffer.h change from "[RPC][BUGFIX][BACKPORT-0.6] Fix bug in rpc ring buffer shrink" (by tqchen, Pull Request #5516, apache/tvm), but it didn't help.