Hi @areusch, @manupa-arm, @Mousius.
I am using microTVM to deploy a customized VGG model to an STM32F746g_disco board; the model is quantized to int8 with TFLite. The model structure is as follows:
After `generated_project.build()` runs, microTVM generates the `default_lib1.c` file in the `generated-project/model/codegen/host/src` directory. There you can see that `TVMBackendAllocWorkspace` allocates a large block of memory (about 110 KB) at runtime, and that memory is only released after inference finishes.
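For context, this is roughly how I generate and build the project (a minimal sketch: `tflite_model` is a placeholder for the parsed model, and the template platform and project option names are assumptions that vary across TVM versions):

```python
import tvm
import tvm.micro
from tvm import relay
from tvm.relay.backend import Runtime

# `tflite_model` is the parsed int8-quantized TFLite flatbuffer (parsing elided).
mod, params = relay.frontend.from_tflite(tflite_model)

with tvm.transform.PassContext(opt_level=3):
    module = relay.build(
        mod,
        target=tvm.target.target.micro("stm32f746xx"),  # Cortex-M7 on the F746
        runtime=Runtime("crt"),
        params=params,
    )

# Option names ("project_type", "zephyr_board") may differ between TVM versions.
template = tvm.micro.get_microtvm_template_projects("zephyr")
generated_project = tvm.micro.generate_project(
    template, module, "generated-project",
    {"project_type": "host_driven", "zephyr_board": "stm32f746g_disco"},
)
generated_project.build()
```

I have the following questions: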
- The TFLite model itself is only about 52 KB, so why does microTVM need such a large amount of memory (about 110 KB) for intermediate variables?
- Which part of the IR determines the memory usage of the `TVMBackendAllocWorkspace` function? I want to reduce its memory usage.
- What factors affect the size of the memory allocation? Does the current microTVM consider memory reuse between operators?
- Can the Unified Static Memory Planner (RFC 0009) reduce the runtime memory usage? (See the sketch after this list for how I would try to enable it.)
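To make that last question concrete, here is how I understand USMP would be enabled, based on RFC 0009 (a sketch under my assumptions: that the `tir.usmp.enable` PassContext option is the switch, and that USMP only applies to the AoT executor):

```python
import tvm
from tvm import relay
from tvm.relay.backend import Executor, Runtime

# Assumption: USMP is only wired up for the AoT executor.
executor = Executor("aot", {"interface-api": "c", "unpacked-api": True})
runtime = Runtime("crt")

with tvm.transform.PassContext(
    opt_level=3,
    config={
        "tir.usmp.enable": True,             # enable the Unified Static Memory Planner
        "tir.usmp.algorithm": "hill_climb",  # or "greedy_by_size" / "greedy_by_conflicts"
    },
):
    module = relay.build(
        mod,
        target=tvm.target.target.micro("stm32f746xx"),
        params=params,
        executor=executor,
        runtime=runtime,
    )
```

If that is the right way to use it, I would expect the `TVMBackendAllocWorkspace` calls to disappear in favor of a statically planned workspace pool, which is exactly the behavior I am after.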
Finally, TVM seems to focus more on inference acceleration, and its memory optimization is not yet mature (though it could be). I really hope microTVM can deploy models with better memory behavior on bare-metal devices.
Looking forward to your reply.