BYOC for ARM Ethos-N Fails

guanjen375 · November 7, 2022, 3:00pm

Thanks a lot. I will check my setup.

guanjen375 · November 8, 2022, 11:04am

After following the tips from here: [Bug] Running test_conv2d.py got error output in Ethos-N78 platform. · Issue #13191 · apache/tvm · GitHub

I can run using my python file:

I also want to run the model using tvmc with cpu and npu.

Do you know how to modify my code to achieve this?

The Original Code

python3 -m tvm.driver.tvmc compile --target="ethos-n -variant=n78, llvm"

lhutton1 · November 8, 2022, 5:14pm

That’s good news. I believe the issue could be that the TVMC target string is not specific enough for the variant you’re compiling for. If you’re compiling for the 4TOPS_4PLE_RATIO variant, you can change the target string to "ethos-n -variant=n78 -tops=4 -ple_ratio=4, llvm". Hopefully that helps.
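To make the suggestion above concrete, here is a minimal Python sketch that assembles such a target string from its parts. The helper name `ethosn_target` and its parameters are hypothetical and not part of TVM; only the resulting string format is taken from this thread.

```python
def ethosn_target(variant="n78", tops=None, ple_ratio=None, host="llvm"):
    """Build a target string like the ones quoted in this thread.

    Hypothetical helper for illustration only; it just formats text
    and does not call into TVM.
    """
    parts = ["ethos-n", f"-variant={variant}"]
    if tops is not None:
        parts.append(f"-tops={tops}")            # e.g. 4 for 4TOPS
    if ple_ratio is not None:
        parts.append(f"-ple_ratio={ple_ratio}")  # e.g. 4 for 4PLE_RATIO
    # The host (CPU) target follows the NPU target after a comma.
    return " ".join(parts) + ", " + host


target = ethosn_target(tops=4, ple_ratio=4)
print(target)
```

The printed string is what would go inside the quotes of --target="..." on the tvmc compile command line.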

guanjen375 · November 9, 2022, 1:23am

Thanks a lot. After using target = "ethos-n -variant=n78 -tops=4 -ple_ratio=4, llvm"

I can run mobilenet correctly on cpu and npu, both with and without tvmc.

guanjen375 · November 9, 2022, 4:48pm

I can now run mobilenet and inceptionet with cpu and npu correctly (tested on 1000 pictures).

Nevertheless, when I try to run resnet with cpu and npu, it fails.

I found that when I dispatch fully connected to the cpu, the result is correct.

The code at /tvm/python/tvm/relay/op/contrib/ethosn.py

Is there any difference in how fully connected is computed on the cpu and on the npu?

If not, I think I should check my ethos-n setup as before.

Besides, this is my resnet model: https://drive.google.com/file/d/1dNDq7sgUpsK4QsTGBEgSYbozlPq2Zr9c/view?usp=share_link
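One way to quantify a mismatch like this over a batch of test pictures is to compare the top-1 class from a cpu-only run with the one from a cpu and npu run. The sketch below is plain Python with made-up scores; the function names and the data are assumptions for illustration and do not come from TVM or from the model above.

```python
def top1(scores):
    """Index of the highest score in one output vector."""
    return max(range(len(scores)), key=lambda i: scores[i])


def top1_agreement(cpu_outputs, npu_outputs):
    """Fraction of samples where both runs predict the same class."""
    if not cpu_outputs or len(cpu_outputs) != len(npu_outputs):
        raise ValueError("need two non-empty output lists of equal length")
    matches = sum(top1(c) == top1(n) for c, n in zip(cpu_outputs, npu_outputs))
    return matches / len(cpu_outputs)


# Made-up outputs for two pictures and three classes.
cpu_only = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
cpu_npu = [[0.2, 0.6, 0.2], [0.1, 0.8, 0.1]]
agreement = top1_agreement(cpu_only, cpu_npu)
print(agreement)
```

Running the same comparison with one operator at a time moved back to the cpu can help isolate which offloaded operator changes the result.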

lhutton1 · November 10, 2022, 8:56am

From the snippet you show, it looks as though a slightly outdated version of TVM is being used, so I suspect that you don’t have this patch which fixes the weight transformation in fully connected: https://github.com/apache/tvm/pull/12970

guanjen375 · November 10, 2022, 9:50am

I have tested with the following branch.

Unfortunately, it failed.

Can I use the newest tvm as an alternative?

guanjen375 · November 11, 2022, 1:36am

I have updated my tvm to the newest version.

However, when I run the model, I get the following error message:

Thus, I modified the file /home/sunplus/project/tvm/relay/op/contrib/ethosn.py as follows:

After this modification, I could run the model with cpu and npu successfully.

Resnet can also be run.

Do you have any comments on this modification?

lhutton1 · November 11, 2022, 11:05am

That’s great to hear. With later versions of TVM we don’t officially support the 22.05 (3.0.1) version of the NPU driver stack, which is why you observed this error. It seems to work for your use case though, so that should be okay for now.

guanjen375 · November 16, 2022, 11:53am

I have successfully tested mobile net / inceptionet / vgg-16 / resnet / squeeze net with cpu and npu using tvmc. Unfortunately, when I tried yolo, it failed during compilation (running with cpu only works).

Here is my code: GitHub - guanjen375/tvmc_debug

The error message:

I am already using the newest tvm.

tvm commit: 5364e5a39a5e33728b7f5a26ddb40543a544ea02

lhutton1 · November 16, 2022, 2:59pm

Hi @guanjen375, I just gave this a try and can reproduce the issue. It seems to occur while compiling qnn.add. Although not perfect, perhaps you could try commenting out qnn.add offloading in the pattern table? The yolo model you have seems a bit different from the one I’ve tested with in the past.

I’ll be away for a few days now but I’ll dig into this when I get back, apologies for the inconvenience.
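As a rough illustration of the workaround above, the sketch below filters entries out of a pattern table by name instead of commenting them out. It assumes the table is a list of tuples whose first element is the pattern name, which is the general shape of BYOC pattern tables. The entry names are placeholders and may not match the names in ethosn.py, and `drop_patterns` is a hypothetical helper, not a TVM function.

```python
def drop_patterns(pattern_table, excluded_names):
    """Return a copy of the table without the excluded pattern names.

    Operators whose patterns are dropped are no longer offloaded to the
    NPU, so they fall back to the host (cpu) target.
    """
    return [entry for entry in pattern_table if entry[0] not in excluded_names]


# Stand-in table: real entries also carry a pattern and a check function.
# The names below are placeholders for illustration.
table = [
    ("ethos-n.qnn_conv2d", None, None),
    ("ethos-n.qnn_fc", None, None),
    ("ethos-n.qnn_add", None, None),
]

filtered = drop_patterns(table, {"ethos-n.qnn_add"})
print([name for name, _, _ in filtered])
```

The same filter with the fully connected entry excluded would mirror the earlier experiment of dispatching fully connected to the cpu.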

guanjen375 · November 16, 2022, 3:13pm

Thanks for your help. The yolo model is from my teammate. I will check with him and test what you mentioned.

lhutton1 · November 24, 2022, 1:41pm

Hi @guanjen375, I took a further look into this and the issue seems to be caused by an add operation where one input is a constant value. Due to the data in this constant value, it cannot be converted to either a depth-wise or requantize operation (see https://github.com/apache/tvm/blob/main/python/tvm/relay/op/contrib/ethosn.py#L390), both of which expect one input to be constant. Instead, it remains a standard add operation, but the constant value doesn’t get handled correctly in the codegen. I’m hoping to get around to fixing this in the next few days.