[BYOC] limited function numbers on crt runtime

A1245967 · January 19, 2022, 10:55am

Hi!
I am using the BYOC flow with the CRT runtime to verify some operators with our custom dense function.
However, when I try to import an RNN (GRU) operator with sequence length 60, our codegen generates more than 256 functions, and I encounter the following error:

invalid function index: 01d4

I found that the error is raised in the function “TVMFuncRegistry_GetByIndex”, which is in the file “tvm/src/runtime/crt/common/func_registry.c”.

tvm_crt_error_t TVMFuncRegistry_GetByIndex(const TVMFuncRegistry* reg,
                                           tvm_function_index_t function_index,
                                           TVMBackendPackedCFunc* out_func) {
  uint8_t num_funcs;

  num_funcs = reg->names[0];
  if (function_index >= num_funcs) {
    return kTvmErrorFunctionIndexInvalid;
  }

  *out_func = reg->funcs[function_index];
  return kTvmErrorNoError;
}

The function count is stored in a single char and read into a uint8_t, so it cannot represent more than 255 functions.
How can I solve this problem?
Thanks!

comaniac · January 19, 2022, 6:01pm

That’s an interesting point. I found that our function index is uint16_t, so we should be able to support 65536 functions. However, the function count is stored in the first element of reg->names, which is a char array with only 8 bits per element, as uint8_t num_funcs indicates. So in practice the count is capped at what a single byte can hold, which is 255.
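
To make the truncation concrete, here is a minimal sketch of what happens when a count larger than one byte is read through a uint8_t. The helper names and the count of 500 are hypothetical and not taken from TVM; only the failing index 0x01d4 comes from the error message above.

```c
#include <stdint.h>

/* Hypothetical helper: models reading the count through a uint8_t,
 * as `num_funcs = reg->names[0]` does in the quoted function. */
static uint8_t sketch_count_seen_by_registry(uint16_t real_count) {
  return (uint8_t)real_count; /* keeps only the low 8 bits */
}

/* Returns 1 when the bounds check would reject `index`, 0 otherwise. */
static int sketch_index_rejected(uint16_t real_count, uint16_t index) {
  uint8_t num_funcs = sketch_count_seen_by_registry(real_count);
  return index >= num_funcs;
}
```

With an assumed real count of 500, the registry would see 500 mod 256 = 244, so index 0x01d4 (468) is rejected even though the function exists.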

The solution should be to move the function count into a standalone field in TVMFuncRegistry and make its type match tvm_function_index_t.
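
A rough sketch of what a standalone count field could look like follows. All names here (SketchFuncRegistry, sketch_packed_func_t, and so on) are hypothetical stand-ins rather than the real TVM definitions, and the function pointer type is simplified for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t sketch_function_index_t;   /* stand-in for tvm_function_index_t */
typedef int (*sketch_packed_func_t)(void);  /* simplified stand-in for TVMBackendPackedCFunc */

typedef struct {
  sketch_function_index_t num_funcs; /* count in its own field, same width as the index */
  const char* names;                 /* names no longer need to carry the count */
  const sketch_packed_func_t* funcs;
} SketchFuncRegistry;

enum { kSketchOk = 0, kSketchIndexInvalid = 1 };

/* Same bounds check as the quoted function, but against the wide field. */
static int SketchRegistry_GetByIndex(const SketchFuncRegistry* reg,
                                     sketch_function_index_t index,
                                     sketch_packed_func_t* out_func) {
  if (index >= reg->num_funcs) {
    return kSketchIndexInvalid;
  }
  *out_func = reg->funcs[index];
  return kSketchOk;
}

/* Dummy function used only to populate a table when trying the sketch. */
static int sketch_dummy(void) { return 42; }
```

Because the count and the index share a type, any index the caller can express can also be bounds-checked correctly.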

cc @areusch

areusch · January 20, 2022, 4:03pm

@A1245967 thanks for the report! this is definitely an oversight. @comaniac has the correct suggested fix.

@A1245967 would you be up for submitting a PR? we can also create a GH issue to track work on this as we have bandwidth.

I just wanted to note one thing here: the reason I used a char was to ensure that struct alignment doesn’t waste a couple of bytes. since we emit this struct in binary using the LLVM codegen, it’s a bit more complex to fix this for end users than merely passing -fpack-struct at compile time, because the LLVM codegen would have to know that the downstream firmware expected this. therefore I’d suggest we cast the first two bytes to uint16_t to make this check rather than split out the field.
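
As a minimal sketch of the two-byte idea, the count can be read from the first two bytes of the names blob. The thread does not specify a byte order, so low byte first is an assumption here, and the helper names are hypothetical. The value is assembled byte by byte instead of through a pointer cast, which sidesteps alignment concerns on targets that care about them.

```c
#include <stdint.h>

/* Hypothetical helper: reads a 16-bit function count from the first two
 * bytes of the names blob. ASSUMPTION: low byte first (little-endian). */
static uint16_t sketch_read_num_funcs(const char* names) {
  const uint8_t* bytes = (const uint8_t*)names;
  return (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
}

/* Returns 1 when `index` passes the bounds check, 0 otherwise. */
static int sketch_index_valid(const char* names, uint16_t index) {
  return index < sketch_read_num_funcs(names);
}
```

With a count of 500 encoded as the bytes 0xF4, 0x01, index 0x01d4 now passes the check. Whatever code generates the blob would need to emit the count in the same two-byte layout, and any code that walks the names would presumably need to skip two bytes instead of one.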

A1245967 · January 21, 2022, 2:49pm

Hi, @areusch.
I can run my RNN model after casting the first two bytes of the names string to uint16_t.
I have created a GH issue and submitted a PR.

Thanks!