BYOC for ARM Ethos-N Fails

Thanks a lot. After following these tips, I could run with ethos-n and llvm. However, after inspecting the graph.json from module.tar, I found there is no compute operator using ethos-n. I think that is because of the model I used; I should switch to a different model to test my ethos-n. Is that right?

Here is my code which can run successfully.

Here is the model I used now: https://github.com/onnx/models/raw/b9a54e89508f101a1611cd64f4ef56b9cb62c7cf/vision/classification/resnet/model/resnet50-v2-7.onnx
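As an aside, one way to check what was offloaded is to read the graph.json inside module.tar and count the operator names: anything running on the NPU shows up as an ethos-n external function rather than a plain fused CPU op. A rough sketch using only the standard library (the exact file layout inside the tvmc package archive can vary between TVM versions, so treat this as an assumption):

```python
import json
import tarfile

def count_ops(graph: dict) -> dict:
    """Count operator/function names in a graph executor graph.json dict."""
    counts = {}
    for node in graph.get("nodes", []):
        if node.get("op") == "null":  # inputs and parameters, not compute
            continue
        name = node.get("attrs", {}).get("func_name", node.get("name", "?"))
        counts[name] = counts.get(name, 0) + 1
    return counts

def offload_report(package_path: str) -> dict:
    """Extract graph.json from a tvmc module.tar and summarise its operators."""
    with tarfile.open(package_path) as tar:
        member = next(m for m in tar.getmembers() if m.name.endswith("graph.json"))
        graph = json.load(tar.extractfile(member))
    return count_ops(graph)

if __name__ == "__main__":
    for name, n in sorted(offload_report("module.tar").items()):
        marker = "NPU" if "ethos" in name else "CPU"
        print(f"{marker}  {name}: {n}")
```

If every function name is a plain fused CPU op and nothing contains "ethos", then no operator was offloaded.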

Hi, yes it looks like the model you’re using is in NCHW format which is not supported. You could try to convert the model to NHWC by adding desired_layout="NHWC" to the compile function. This should insert layout_transform operations where necessary so that the graph can be offloaded to the NPU.
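For reference, the conversion that layout_transform performs is just an axis permutation; a minimal NumPy illustration of NCHW to NHWC:

```python
import numpy as np

# A dummy NCHW activation: batch=1, channels=3, height=2, width=2.
x_nchw = np.arange(12).reshape(1, 3, 2, 2)

# NCHW -> NHWC: move the channel axis to the end.
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))

print(x_nchw.shape)  # (1, 3, 2, 2)
print(x_nhwc.shape)  # (1, 2, 2, 3)
```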

I have tried several different scripts for testing. However, I found that ethos-n is never actually used at runtime. My code is as follows:

from tvm.driver import tvmc

model = tvmc.load("my_model.onnx")
print(model.summary())
package = tvmc.compile(
    model,
    target="ethos-n -variant=n78, llvm",
    dump_code="relay",
    package_path="module.tar",
    desired_layout="NHWC",
)
result = tvmc.run(package, device="cpu")
print(result.outputs)

The output Relay graph is as follows (partial). (I found there is no use of ethos-n.)

  1. My Ethos-N is at /dev/ethosn0. I found tvmc.run does not actually use it.

    (I found that even after I deleted /dev/ethosn0, the runtime executed without any error message.)

  2. Every time I tested the model, I got a different result.

Apologies for missing this before, it looks like the graph you’re providing is in float32 format. Only int8 and uint8 formats can be offloaded to the NPU, so this explains why no operations are being offloaded.
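For context, an int8/uint8 model stores each tensor through an affine mapping real ≈ scale * (q - zero_point); a float32 graph carries no such quantization parameters, so there is nothing for the NPU to execute. A toy illustration of uint8 quantization (generic, not TVM code):

```python
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Affine-quantize a float32 array to uint8: real ≈ scale * (q - zero_point)."""
    lo, hi = float(x.min()), float(x.max())
    lo, hi = min(lo, 0.0), max(hi, 0.0)        # the representable range must include 0
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zp = quantize_uint8(x)
x_hat = scale * (q.astype(np.float32) - zp)    # dequantize to check the round-trip
print(q, x_hat)
```

Pre-quantized TFLite models (like the mobilenet quant models) already contain these parameters, which is why they can be offloaded.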

Thanks a lot. I will test other models now.

I have tested my mobilenet model on cpu and on cpu+npu using tvmc, but it still fails: the results are not the same between cpu and cpu+npu.

My code is here: https://github.com/guanjen375/EthosN-tvmc

By running run_llvm.sh, you can run the mobilenet model with cpu. (target = “llvm”)

By running run_ethosn.sh, you can run the mobilenet model with cpu and npu. (target = "ethos-n -variant=n78, llvm")

The results are also printed while running; you can see that the two results are different.

What should I do to make them consistent?

Thanks @guanjen375, I believe I was able to reproduce the mismatch (232 vs 231, 1 vs 2). This can simply be attributed to differences in the rounding behaviour of the LLVM backend and the NPU integration.
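Such off-by-one differences can come from nothing more than the rounding mode applied when requantizing intermediate values, and backends are allowed to differ here. A toy illustration in plain Python (not the actual NPU behaviour):

```python
import math

def round_half_to_even(v: float) -> int:
    """Python's built-in round(): ties go to the nearest even integer."""
    return round(v)

def round_half_away_from_zero(v: float) -> int:
    """Rounding mode some fixed-point hardware paths use: ties move away from zero."""
    return int(math.floor(v + 0.5)) if v >= 0 else int(math.ceil(v - 0.5))

v = 230.5  # an intermediate value sitting exactly on the rounding boundary
print(round_half_to_even(v), round_half_away_from_zero(v))  # 230 231
```

Two backends that agree on every multiply-accumulate can still disagree by one unit in the final quantized output because of this.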

So what you mean is the following:

cpu result : [[0 0 231 … 0 0 0]] (source run_llvm.sh)

cpu+npu result : [[0 0 232 … 0 0 0]] (source run_ethosn.sh)

Is that right?

Here is my result:

cpu result : [[0 0 231 … 0 0 0]]

npu result : [[0 0 0 … 0 0 0]]

If the above is correct, can you tell me which TVM version and which NPU library version you are using?

Yes that’s correct, your first example is what I got when running the provided scripts. I’m using the latest main f2a740331f21106787a29566185d8924e5dcb25a and the NPU driver stack version 22.08.

It seems as though something could be incorrect with your setup. Just to confirm, does this occur with other networks/operators as well or just the network you’re trying here (mobilenet v2)? It might be worth trying out just the average pool you tested previously to make sure that gives the expected result.

Thanks a lot. I will check my setup.

After following the tips from here: https://github.com/apache/tvm/issues/13191

I can run using my Python file.

I also want to run the model using tvmc with cpu and npu.

Do you know how to modify my code to achieve this?

The Original Code

python3 -m tvm.driver.tvmc compile --target="ethos-n -variant=n78, llvm"

That’s good news, I believe the issue could be due to the TVMC target string not being specific enough for the variant you’re compiling for. If you’re compiling for the 4TOPS_4PLE_RATIO variant, you can change the target string to ethos-n -variant=n78 -tops=4 -ple_ratio=4, llvm. Hopefully that helps!

Thanks a lot. After using target = "ethos-n -variant=n78 -tops=4 -ple_ratio=4, llvm",

I can run mobilenet correctly on cpu and npu, both with tvmc and without it.

I can now run mobilenet and inceptionet correctly with cpu and npu. (tested on 1000 images)

Nevertheless, when I try to run resnet with cpu and npu, it fails.

I found that when I dispatch fully connected to the cpu, the result is correct.

The code is at /tvm/python/tvm/relay/op/contrib/ethosn.py

Is there any computational difference between cpu and npu for fully connected?

If not, I think I should check my ethos-n setup just as before.

Besides, this is my resnet model: resnet50_uint8_tf2.1_20200911_quant.tflite - Google Drive

From the snippet you show, it looks as though a slightly outdated version of TVM is being used, so I suspect that you don’t have this patch which fixes the weight transformation in fully connected: https://github.com/apache/tvm/pull/12970
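To see why a weight-transformation bug shows up exactly as "fully connected wrong on NPU, right on CPU": a dense layer computes y = x · Wᵀ, so if one backend consumes the weight buffer in a re-laid-out form without the matching transform, the outputs silently differ while everything else still runs. A generic NumPy illustration (not the actual TVM code in question):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4)).astype(np.float32)  # one input row with 4 features
w = rng.standard_normal((3, 4)).astype(np.float32)  # weights stored as (out_units, in_features)

y_correct = x @ w.T               # dense layer: y = x · Wᵀ, matching the stored layout
y_mislaid = x @ w.reshape(4, 3)   # layout bug: same bytes reinterpreted without the transpose
print(y_correct)
print(y_mislaid)
```

Both products have the same shape, so nothing crashes; only the values are wrong, which matches the symptom described above.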

I have tested with the following branch.

Unfortunately, it still fails.

Can I use the newest TVM instead?

I have updated my TVM to the newest version.

However, when I run the model, I get the following error message:

Thus, I modified the file /home/sunplus/project/tvm/relay/op/contrib/ethosn.py as follows:

After the modification, I could run the model with cpu and npu successfully.

Resnet can also be run.

Do you have any comments on this modification?

That’s great to hear. With later versions of TVM we don’t officially support the 22.05 (3.0.1) version of the NPU driver stack, hence this error. However, it seems to work for your use case, so that should be okay for now.

I have successfully tested mobilenet / inceptionet / vgg-16 / resnet / squeezenet with cpu and npu using tvmc. Unfortunately, when I try to test yolo, compilation fails. (I have also tested running it with cpu only, which succeeds.)

Here is my code: https://github.com/guanjen375/tvmc_debug

The error message:

I am already using the newest tvm.

tvm version: 5364e5a39a5e33728b7f5a26ddb40543a544ea02

Hi @guanjen375, I just gave this a try and can reproduce the issue. It seems as though it is occurring while compiling for qnn.add. Although not perfect, perhaps you could try with qnn.add offloading commented out in the pattern table? The yolo model you have seems a bit different from the one I’ve tested with in the past.

I’ll be away for a few days now but I’ll dig into this when I get back, apologies for the inconvenience.