Motivation
As part of the Standalone µTVM Roadmap, the TVM RPC server is being implemented on bare metal devices. The overall approach is to link MinRPCServer against the MISRA-C runtime plus additional compiled TVM functions. There are several new features that need to be added to the MISRA-C runtime in order to service all of the RPC requests.
-
TVMFuncGetGlobal
andTVMModGetFunction
are currently implemented as hardcodedif-else
blocks. Some scalable, programmatic solution is needed to interact with both generated TVMModules and PackedCFuncs from third-party code. - Since Modules are typically instantiated in the C++ runtime using
dlpack
, some strategy for Module instantiation is needed in the MISRA-C runtime. - Unit tests coverage should be improved, and black-box tests need to be written, since the MISRA- C runtime tests will eventually be used as an acceptance test for µTVM devices. This RFC doesn’t specifically address testing, but does address some improvements needed to enable tests to be written:
- There is no standalone build of the MISRA-C runtime currently checked-in; so, it’s possible that the MISRA-C runtime could depend on C++ code.
- There isn’t a way to write tests against the MISRA-C runtime without linking it to
libtvm.so
.
- Some C++ features are being used, such as
exit()
, which may need to be extended in an embedded setting. Also, a logging solution needs to be devised to ease debugging.
Changes to the MISRA-C Runtime
FuncRegistry
Function lookup by string name is a common RPC task. Currently, the MISRA-C runtime implements function lookup using an if-else
statement tree. This RFC introduces a new type, TVMFuncRegistry
, which can be used for both global and module functions.
/*!
* \brief A data structure that facilitates function lookup by C-string name.
*/
typedef struct TVMFuncRegistry {
/*! \brief Names of registered functions, concatenated together and separated by \0.
* An additional \0 is present at the end of the concatenated blob to mark the end.
*
* Byte 0 is the number of functions in `funcs`.
*/
const char* names;
/*! \brief Function pointers, in the same order as their names in `names`. */
const TVMBackendPackedCFunc* funcs;
} TVMFuncRegistry;
The design constraints were:
- Assume a small number of functions (i.e. < 30)
- All lookup functions should work on a
const
instance of the Registry so it can be placed in flash. - Easy to generate in TVM codegen.
The names
field is a uint8_t
count followed by the C-string names of functions concatenated together. A terminating \0
marks the end of the names
list, so when count > 0
, names
always ends in a double \0\0
.
Byte: 0 1 2 3 4 5 6 7 8 9 a b c d
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
names: | N | F u n c 0 \0 | F u n c 1 \0| \0|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Legend:
N - `count` field
Func0 - 0th registered function name
Func1 - 1st registered function name
Function lookup is done by linearly scanning the concatenated names until a match is found. Function handles are encoded as the 0-based index of the matching name in the FuncRegistry.
From the index, the function pointer can be retrieved in constant time. The index is validated using the count
field and the pointer is returned by indexing funcs
.
MutableFuncRegistry
Unlike the SystemLib Module, the global function namespace can change at runtime. MutableFuncRegistry
is a RAM-based implementation of FuncRegistry
that adds a Set
function.
/*!
* \brief A TVMFuncRegistry that supports adding and changing the functions.
*/
typedef struct TVMMutableFuncRegistry {
TVMFuncRegistry registry;
/*! \brief maximum number of functions in this registry. */
size_t max_functions;
} TVMMutableFuncRegistry;
Here, registry
is presumed to be mutable, so Set
just directly modifies names
and funcs
. The create function accepts a single block of memory and partitions it into names
and the rest of the data structure. It computes a capacity based on an average function name length and stores that in max_functions
.
Modules
CRT Modules are implemented as structs whose first member is a TVMModule
:
/*!
* \brief Module container of TVM.
*/
typedef struct TVMModule {
/*! \brief The function registry associated with this module. */
const TVMFuncRegistry* registry;
} TVMModule;
Modules can be registered with the CRT (though a public-facing function is yet TBD). Upon registration, a pointer is placed in a global modules array, and the module is assigned id equal to the index in this array.
Note that TVMBackendRegisterSystemLibSymbol
will not be implemented in the CRT C implementation. SystemLibs generated for this implementation are expected to contain a TVMModule instance and implement the initialization discussed under Life Cycle.
TVMFunctionHandle
TVMFunctionHandles are now composed of two pieces:
31 30 bits 16 15 bits 0
+---------------------------+---------------------------+
| V | Module Index | Function Index |
+---------------------------+---------------------------+
Legend
V: Module Index Valid bit (1 = valid, 0 = invalid; global function)
TVMFuncCall
TVMFuncCall is used to satisfy the kCallFunc
RPC code. No TVMFuncCall
implementation exists in the MISRA-C runtime. One goal in implementing it is to do something such that C code could be called from either the C++ implementation or the MISRA-C runtime without modification. This section describes the existing codebase to motivate some refactoring.
At present, TVMFuncCall
invokes C++ functions using the CallPacked
:
void PackedFunc::CallPacked(TVMArgs args, TVMRetValue* rv);
When calling C functions, it’s customary to provide a void*
, termed in TVM as resource_handle
. The C++ implementation of the TVM C Runtime API provides this using a closure glue function:
int TVMFuncCreateFromCFunc(TVMPackedCFunc func, void* resource_handle, TVMPackedCFuncFinalizer fin,
TVMFunctionHandle* out) {
// ...
*out = new PackedFunc([func, resource_handle](TVMArgs args, TVMRetValue* rv) {
int ret = func((TVMValue*)args.values, (int*)args.type_codes, // NOLINT(*)
args.num_args, rv, resource_handle);
if (ret != 0) {
throw dmlc::Error(TVMGetLastError() + ::dmlc::StackTrace());
}
});
// ...
}
This glue calls functions with signature:
/*!
* \\brief C type of packed function.
*
* \\param args The arguments
* \\param type_codes The type codes of the arguments
* \\param num_args Number of arguments.
* \\param ret The return value handle.
* \\param resource_handle The handle additional resouce handle from fron-end.
* \\return 0 if success, -1 if failure happens, set error via TVMAPISetLastError.
* \\sa TVMCFuncSetReturn
*/
typedef int (*TVMPackedCFunc)(TVMValue* args, int* type_codes, int num_args, TVMRetValueHandle ret,
void* resource_handle);
Meanwhile, generated TVM functions use yet another type signature:
/*!
* \\brief Signature for backend functions exported as DLL.
*
* \\param args The arguments
* \\param type_codes The type codes of the arguments
* \\param num_args Number of arguments.
* \\param out_ret_value The output value of the the return value.
* \\param out_ret_tcode The output type code of the return value.
*
* \\return 0 if success, -1 if failure happens, set error via TVMAPISetLastError.
*/
typedef int (*TVMBackendPackedCFunc)(TVMValue* args, int* type_codes, int num_args,
TVMValue* out_ret_value, int* out_ret_tcode);
First, on devices with a limited amount of non-virtual memory, it’s desirable to avoid allocations. Here, it would be good to eliminate the closure.
Second, In order to both invoke generated functions and C libraries compatible with full TVM, the µTVM RPC server needs to call both TVMBackendPackedCFunc
and TVMPackedCFunc
. There are two differences between the two:
-
TVMPackedCFunc
places theout_ret_value
andout_ret_code
into a single C structTVMRetValue
. The purpose of this struct is to make it easier to invokeTVMCFuncSetReturn
, whose purpose is to ease the process of transferring managed objects from C to C++. -
TVMBackendPackedCFunc
lacks aresource_handle
parameter.
For Discussion
There are a couple of approaches to handling the discrepancy in APIs here:
A1. Decide that each FuncRegistry
can store functions of only one type, and store a boolean indicating the type.
A2. Store this flag per-function.
A3. Resolve the differences in API and delete one of the types.
A4. Add support for calling TVMBackendPackedCFunc
from the C++ runtime, and standardize on only calling TVMBackendPackedCFunc
from this MISRA-C runtime.
There are also some options for handling resource_handle
:
O1. Store a has_resource_handle
bit plus associated resource_handle
for each function in a TVMFuncRegistry
.
O2. Add resource_handle
to TVMBackendPackedCFunc, and decide that, as a MISRA-C constraint, resource_handle
is:
- when invoking a module function: set to the
TVMModule*
that contains the function - set to
NULL
when invoking a global function.
In the long term, I think we should adopt approach A4 and option O2. This would mean:
- The function signature for C functions, including generated TVM functions, will be:
typedef int (*TVMBackendPackedCFunc)(TVMValue* args, int* type_codes, int num_args,
TVMValue* out_ret_value, int* out_ret_tcode,
void* resource_handle);
-
TVMCFuncSetReturn
will not be implemented for the MISRA-C runtime. -
resource_handle
will be passed as in O2. - Codegen for TVM functions will be changed to include the 6th runtime parameter. Some platforms may not technically need any modification, because the argument passing strategy will ignore extra arguments at runtime.
CRT Life Cycle
In the C++ implementation of the CRT, much of the initialization process is accomplished by global initializers. For runtimes targeting bare-metal devices, this is less desirable for a couple of reasons:
- Often SoCs reset to a slow CPU frequency in a high power configuration. Running initialization code as soon as possible improves system startup performance and thereby power draw.
- Where control loops are involved, the system startup time often bounds the maximum jitter seen in the control loop. RTOS or interrupt-driven control are typically employed to address this problem, and need to be ultimately in control of the system while user code (i.e. µTVM) is executing.
- The memory subsystem is initialized by user code, and that may not have happened when global initializers are running.
For that reason, the C implementation defines an explicit initializer for the runtime:
/*! \brief Initialize internal runtime components.
*
* Call this function before calling any other functions in the runtime.
* \\return 0 when success, negative integer when error occurs.
*/
TVM_DLL int TVMInitializeRuntime(void);
Additionally, an explicit initializer is defined for the SystemLib:
/*! \brief Entry point for the system lib module. */
const TVMModule* TVMSystemLibEntryPoint(void);
This function is called when runtime.SystemLib
is invoked for the first time.
Module Initialization
Modules are currently registered using an internal function:
int _TVMModCreateFromCModule(const TVMModule* mod, TVMModuleHandle* out_handle);
Client code can get a TVMModuleHandle
by invoking a PackedFunc
which returns one. Currently, the only such PackedFunc
exposed as part of the CRT is runtime.SystemLib
. C code can register additional such functions using TVMFuncRegisterGlobal
. This will probably be exposed as a public CRT API in a future PR, but not part of this RFC now.
Isolated CRT Build
At present, the CRT is built by #include
ing a series of .c
files into bundle_deploy
. There isn’t anything else currently at master
enforcing a separation between CRT and TVM C++ runtime, particularly for unit testing. The C-based framing and session layers introduced below are intended to be linked into both the CRT and the TVM C++ runtime, so it would help to have better assurance that CRT components truly don’t rely on any C++ headers or linked pieces. Also, it will be difficult (at least initially) to run unit tests for the framing and session layers on microcontrollers, so linking those against just the CRT improves confidence in their portability.
For all of those reasons, the standalone CRT will be copied to a directory in build
and compiled from there using a make
-based build without the traditional include paths. The make
-based build is also useful as a starting point to integrate the CRT with custom firmware.
crt_config.h
In building the CRT as a standalone library (see Isolated CRT Build above), some parameters need to be fixed at compile time. A new file, crt_config.h
, will #define
those parameters. Each time the CRT is compiled for a particular target environment, this file needs to be provided to the make
-based build system.
Initially, crt_config.h
will contain the data from apps/bundle_deploy/runtime.c
with changes and similar parameters for TVMFuncRegistry
support:
/*! Support low-level debugging in MISRA-C runtime */
#define TVM_CRT_DEBUG 0
/*! Maximum supported dimension in NDArray */
#define TVM_CRT_MAX_NDIM 6
/*! Maximum supported arguments in generated functions */
#define TVM_CRT_MAX_ARGS 10
/*! Maximum supported string length in dltype, e.g. "int8", "int16", "float32" */
#define TVM_CRT_STRLEN_DLTYPE 10
/*! Maximum supported string length in function names */
#define TVM_CRT_STRLEN_NAME 80
/*!
* \brief Log memory pool size for virtual memory allocation
*
* Here is a list of possible choices:
* * use 16 for 64 KiB memory space
* * use 17 for 128 KiB memory space
* * use 18 for 256 KiB memory space
* * use 19 for 512 KiB memory space
* * use 20 for 1 MiB memory space
* * use 21 for 2 MiB memory space
* * use 22 for 4 MiB memory space
* * use 23 for 8 MiB memory space
* * use 24 for 16 MiB memory space
* * use 25 for 32 MiB memory space
* * use 26 for 64 MiB memory space
* * use 27 for 128 MiB memory space
* * use 28 for 256 MiB memory space
*/
#define TVM_CRT_LOG_VIRT_MEM_SIZE 24
/*! \\brief Page size for virtual memory allocation */
#define TVM_CRT_PAGE_BYTES 4096
Testing
Unit tests can be written against the isolated CRT build without linking parts of the TVM runtime. Integration tests will be done using the µTVM RPC Server, and discussed further after that lands.
TVMPlatform
functions
A new function prefix will be introduced, TVMPlatform
, which contains functions that vary by platform. These are not provided by CRT and should be implemented by each platform. In this RFC, only the following function is added:
-
void __attribute__((noreturn)) TVMPlatformAbort(int error_code)
- called when a CHECK failure occurs