Thanks @guanjen375,
To answer the question about average pool, a difference of 1 can be expected. This is because TVM’s schedules can apply slightly different rounding from the NPU for quantized types. In fact, all of our operator tests compare results within a tolerance of 1 for this reason, e.g. for average pool see: https://github.com/apache/tvm/blob/main/tests/python/contrib/test_ethosn/test_pooling.py#L68
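As a rough illustration of how an off-by-one can come from rounding alone, here is a minimal pure-Python sketch. The two rounding rules below (truncation and round-half-up) are assumptions chosen for illustration; they are not claimed to be the exact rules used by TVM's schedules or by the NPU, and the helper names are hypothetical.

```python
def avg_pool_floor(window):
    # Integer average that truncates the fractional part
    # (floor division, valid here because the inputs are non-negative).
    return sum(window) // len(window)


def avg_pool_round_half_up(window):
    # Integer average that rounds halves up, using integer arithmetic only.
    n = len(window)
    return (sum(window) + n // 2) // n


def within_tolerance(a, b, atol=1):
    # True if every pair of outputs differs by at most `atol`.
    return all(abs(x - y) <= atol for x, y in zip(a, b))


# Three example uint8 pooling windows (flattened 2x2 patches).
windows = [[1, 2, 2, 1], [10, 11, 12, 13], [255, 254, 255, 255]]

floor_out = [avg_pool_floor(w) for w in windows]
round_out = [avg_pool_round_half_up(w) for w in windows]

print(floor_out, round_out)
print("exact match:", floor_out == round_out)
print("within tolerance of 1:", within_tolerance(floor_out, round_out))
```

The outputs differ on every window yet never by more than 1, which is why comparing quantized results with an absolute tolerance of 1 is a reasonable check.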
For resnet, thanks for sending the example across. Without seeing the uint8 variant of resnet you seem to be using I can’t say for sure what the issue is, but I’ll try to offer some suggestions. When loading a model using the from_tflite mechanism, parameters (constants) are not bound to the module and are instead treated similarly to variable inputs. The partitioning relies on these parameters being bound to the main function for pattern matching to work as intended, but in your example this isn’t the case. I’d recommend taking a look at my suggestion here: Constant params should be constants - #3 by lhutton1, and applying it before running the merge composite pass.
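For reference, one way to bind parameters in Relay is `bind_params_by_name`. The sketch below assumes `mod` and `params` are the values returned by `from_tflite`; the helper name `bind_params` is hypothetical, and I haven't checked that this matches the linked suggestion exactly.

```python
def bind_params(mod, params):
    """Bind `params` into mod["main"] so they appear as constants.

    Hypothetical helper: `mod` is assumed to be a tvm.IRModule and
    `params` a dict mapping parameter names to values.
    """
    # Imported here so that defining this helper does not itself require TVM.
    from tvm import relay

    # Replace the matching free variables of the main function with constants.
    mod["main"] = relay.build_module.bind_params_by_name(mod["main"], params)
    return mod
```

It would be called as `mod = bind_params(mod, params)` right after `from_tflite` and before any pattern-matching passes run.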
Alternatively (and perhaps more elegantly), there is a convenience function that manages the partitioning pipeline, called partition_for_ethosn(...). I’d suggest using this rather than building your own pipeline, as it allows us to add more optimization passes to the partitioning in the future. It could be dropped into your example along these lines:
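import tvm
from tvm import relay
from tvm.relay.op.contrib.ethosn import partition_for_ethosn
# Import the TFLite model, then let the helper run the whole partitioning pipeline.
mod, params = relay.frontend.from_tflite(model, shape_dict, dtype_dict)
mod = partition_for_ethosn(mod, params, variant="n78")
# Build as before; the elided arguments are whatever your example already uses.
with tvm.transform.PassContext(...):
    relay.build(...)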
Note: I didn’t run this snippet, but hopefully it’s enough to get you started.
With your current example it seems that not many operations will be partitioned for the NPU, so most will run on the CPU instead. It might therefore be that the segfault occurs while lowering those operations for the CPU.
Hope this helps!
Edit: I would also recommend taking a look at the user-facing TVMC Python interface, which should automatically take care of these types of issues.