microTVM MLPerf Tiny input data

Hi,

I am working through the gallery/how_to/work_with_microtvm/micro_aot.py tutorial to learn how to run the MLPerf Tiny benchmarks with microTVM. I am running the inference on the host machine, i.e. no microcontroller is involved. The example works as described, but I am struggling to prepare the input data for other models. The inputs are presented to TVM as NumPy arrays, and I do not know how to convert the raw data into that format. For example, the keyword spotting (KWS) dataset provided by Google consists of WAV files, which have to be converted to int8 NumPy arrays of shape (1, 49, 10, 1).
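For reference, once I have such an array, I feed it into the AOT executor like this, following the tutorial. `project` is the handle returned by tvm.micro.generate_project as in micro_aot.py, the file name is a placeholder, and the input tensor name is model-specific:

```python
import numpy as np
import tvm
import tvm.micro
from tvm.runtime.executor.aot_executor import AotModule

# Hypothetical file holding one prepared sample, shape (1, 49, 10, 1), dtype int8.
sample = np.load("kws_sample.npy")

# `project` is the generated project handle from tvm.micro.generate_project,
# exactly as in the micro_aot.py tutorial.
with tvm.micro.Session(project.transport()) as session:
    aot_executor = AotModule(session.create_aot_executor())
    # The input name depends on the model; "input_1" is an assumption here --
    # check your model definition for the real name.
    aot_executor.get_input("input_1").copyfrom(sample)
    aot_executor.run()
    output = aot_executor.get_output(0).numpy()
    print("predicted class:", np.argmax(output))
```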

How do I perform this conversion, and what does this shape represent?

To get some understanding, I tried to convert the provided .npy files back to WAV files, but without success. I am aware of the repo github.com/tlc-pack/web-data, which contains prepared inputs for the tests/micro/common/test_mlperftiny.py example, but it does not include a script or documentation explaining how this preparation was done.

Thank you and best regards, Benedikt

This is just a follow-up. In the MLPerf Tiny repo, under benchmark/training/keyword_spotting, there is a Python script, make_bin_files.py, that can be adapted to prepare inputs for the models in the MLPerf Tiny KWS benchmark. Since the preparation applies several filters (feature extraction) to the input WAV files, a direct conversion from the .npy arrays back to WAV files is not possible.
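For anyone who lands here later: as far as I can tell, the shape (1, 49, 10, 1) is (batch, time frames, MFCC features, channel), i.e. one second of 16 kHz audio becomes 49 frames of 10 MFCC coefficients each. The sketch below outlines the pipeline with tf.signal. It is not a verbatim copy of make_bin_files.py; the window/stride/mel values are my assumptions, and the quantization parameters are placeholders that must be read from the model:

```python
import numpy as np
import tensorflow as tf

def wav_to_model_input(wav_path):
    # Load 1 s of 16 kHz mono audio, padded/trimmed to 16000 samples.
    audio_bytes = tf.io.read_file(wav_path)
    audio, _ = tf.audio.decode_wav(
        audio_bytes, desired_channels=1, desired_samples=16000)
    audio = tf.squeeze(audio, axis=-1)

    # STFT with a 30 ms window and 20 ms stride -> 49 frames.
    stft = tf.signal.stft(audio, frame_length=480, frame_step=320,
                          fft_length=512)
    spectrogram = tf.abs(stft)

    # Map to a 40-bin mel scale, take the log, then keep the first
    # 10 MFCC coefficients -> shape (49, 10).
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=40, num_spectrogram_bins=spectrogram.shape[-1],
        sample_rate=16000, lower_edge_hertz=20.0, upper_edge_hertz=4000.0)
    mel = tf.tensordot(spectrogram, mel_matrix, 1)
    log_mel = tf.math.log(mel + 1e-6)
    mfcc = tf.signal.mfccs_from_log_mel_spectrograms(log_mel)[..., :10]

    # Quantize to int8. scale/zero_point are placeholders here; read the
    # real values from the TFLite model's input quantization parameters.
    scale, zero_point = 1.0, 0  # hypothetical
    quantized = np.clip(np.round(mfcc.numpy() / scale + zero_point),
                        -128, 127).astype(np.int8)

    # Add batch and channel dimensions -> (1, 49, 10, 1).
    return quantized.reshape(1, 49, 10, 1)
```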

I managed to get successful inference runs with microTVM on the quantized (int8) KWS model. However, with the float32 version, the generated AOT project does not classify the samples correctly. What could be the reason for that?
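In case it helps with debugging, the check I am using is to run the same float32 sample through the TFLite reference interpreter and compare it with the microTVM output. A sketch, with a placeholder model file name; if the two outputs agree, the problem is in my input preparation rather than in TVM:

```python
import numpy as np
import tensorflow as tf

def tflite_reference(model_path, sample):
    """Run one float32 sample through the TFLite interpreter."""
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp["index"], sample.astype(np.float32))
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])

# `sample` is the same (1, 49, 10, 1) float32 array fed to microTVM and
# `tvm_output` the result of the AOT run; the model file name is a placeholder.
# ref = tflite_reference("kws_ref_model_float32.tflite", sample)
# np.testing.assert_allclose(tvm_output, ref, rtol=1e-4, atol=1e-4)
```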

I have tried the same with the visual wake words (VWW) benchmark from MLPerf Tiny. In that case, both the int8 and the float32 versions work well with microTVM. Converting the inputs back to the original photos is also possible, since no filters are applied.
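For completeness, this is roughly how I convert a VWW input array back to a viewable image. A sketch assuming the int8 model input of shape (1, 96, 96, 3) and a plain +128 offset; the exact de-quantization depends on the model's input quantization parameters, and the file names are placeholders:

```python
import numpy as np
from PIL import Image

arr = np.load("vww_sample.npy").reshape(96, 96, 3)  # int8 input sample
# Undo the int8 offset to get back to [0, 255] pixel values.
pixels = (arr.astype(np.int16) + 128).astype(np.uint8)
Image.fromarray(pixels).save("vww_sample.png")
```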

Best regards, Benedikt