[CMSIS-NN] Unused variables taking up memory

I've been experimenting with the CMSIS-NN implementation and noticed that it uses more flash than TFLM. The cause is unused variables stemming from the quantization of conv2d, namely the input scale and the weight/filter scale. Although they are originally part of the conv2d call, they are not actually used: when the function call is created they are "replaced" by a shift and a multiplier respectively (see https://github.com/apache/tvm/pull/11431 @ashutosh-arm), yet their data buffers still remain.
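For context, the shift/multiplier replacement mentioned above follows the standard fixed-point requantization scheme: a float scale is decomposed into a 32-bit integer multiplier and a power-of-two shift, so the kernel never needs the float scale at runtime. Below is a minimal, hypothetical sketch of that decomposition (the actual helper in TVM/CMSIS-NN differs in naming and detail):

```python
import math

def quantize_scale(real_scale):
    # Hypothetical sketch: decompose a float scale into a Q31 fixed-point
    # multiplier and a shift, in the style used by integer conv kernels.
    # The real TVM/CMSIS-NN helper is not identical to this.
    if real_scale == 0.0:
        return 0, 0
    mantissa, shift = math.frexp(real_scale)  # mantissa in [0.5, 1.0)
    multiplier = round(mantissa * (1 << 31))
    if multiplier == (1 << 31):  # rounding overflow: renormalize
        multiplier //= 2
        shift += 1
    return multiplier, shift

# The original scale is recovered (to rounding error) as
# multiplier * 2**(shift - 31), which is why the float scale
# buffer is dead weight once multiplier and shift are baked in.
m, s = quantize_scale(0.5)
```

So once codegen has emitted the multiplier/shift pair, the original scale constants carry no runtime information at all.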

Here is an image showing some of the unused buffers that I am seeing (marked in red):

How would one go about removing these?

I don't think I have an answer to your question, but here is the problem causing this extra allocation. Since these constants are code-generated into the global workspace by the C codegen, as long as the TIR contains constants, they will be allocated space under rodata.tvm. The TIR inherits them because Relay ops take a fixed number of arguments. Since qnn.conv2d's arguments cannot be altered in Relay or in RelayToTIR, there is little room for dropping them later in the flow. TIR experts should be able to comment better.
