Hi, I am trying to import a scripted, quantized MobileBERT torch model via the
`relay.frontend.from_pytorch` API, which works fine for quantized vision models like resnet50. However, I ran into the following problem (maybe due to the complexity of the BERT model):
The import fails when it reaches this step (line 4171):

```python
input_scales_for_bias = qnn_torch.add_input_quant_params_to_op_inputs(graph)
```

and specifically this code inside `add_input_quant_params_to_op_inputs` (line 490):

```python
if "quantized::conv" in operator or "quantized::linear" in operator:
    # This is required for quantizing the bias
    assert len(input_scales) == 1, "One quantized parameter expected for qconv or qlinear."
    input_scales_for_bias[node.inputsAt(1).debugName()] = input_scales.node().f("value")
```

This code expects `input_scales.node()` to be something like:

```
%493 : float = prim::Constant[value=0.01865844801068306]()
```

(the case in quantized resnet50), where it can directly pull out the value of the scale since it is a constant.
However, in the MobileBERT case, `input_scales.node()` is actually something like:

```
%cat_output_scale_0.1 : Tensor = prim::GetAttr[name="cat_output_scale_0.1"](%self.1)
```

It is not a constant straight out of the box; the scale has to be fetched as an attribute from `%self.1` (i.e. the ScriptModule). Thus an error is thrown, since there is actually no "value" key attached to a `prim::GetAttr` node.
I tried to hack around this by passing in the `script_module` and fetching the scale constant by hand via something like `float(getattr(script_module, func_name))`, and it does eliminate the error here temporarily. However, this is not a complete solution: I still get more errors in the following steps, since some of the nodes cannot be recognized/parsed correctly.
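For reference, the hack I tried looks roughly like the sketch below. The helper name and the mock node/module classes are mine, just to make the two cases concrete without a real TorchScript graph; the actual change would live inside `add_input_quant_params_to_op_inputs`, using the real `torch._C.Node` accessors (`kind()`, `f("value")`, `s("name")`).

```python
def get_numeric_scale(node, script_module):
    """Resolve the float scale for a quant-param node.

    Handles both shapes shown above:
      * prim::Constant -> value embedded in the node ("value" attribute)
      * prim::GetAttr  -> value stored on the ScriptModule under "name"
    """
    kind = node.kind()
    if kind == "prim::Constant":
        return node.f("value")
    if kind == "prim::GetAttr":
        attr_name = node.s("name")
        return float(getattr(script_module, attr_name))
    raise NotImplementedError("Unexpected scale node kind: %s" % kind)


# --- tiny stand-ins for torch._C.Node, to demonstrate the two cases ---
class MockConstNode:
    def kind(self):
        return "prim::Constant"

    def f(self, key):
        assert key == "value"
        return 0.01865844801068306


class MockGetAttrNode:
    def kind(self):
        return "prim::GetAttr"

    def s(self, key):
        assert key == "name"
        return "cat_output_scale_0.1"


class MockScriptModule:
    pass


module = MockScriptModule()
setattr(module, "cat_output_scale_0.1", 0.125)

print(get_numeric_scale(MockConstNode(), None))      # resnet50-style constant
print(get_numeric_scale(MockGetAttrNode(), module))  # MobileBERT-style GetAttr
```

This resolves the immediate `f("value")` failure for `prim::GetAttr` scale nodes, but as noted it does not fix the parsing errors that show up in later steps.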
Any suggestions for this problem? Thanks : )