First of all, I’m not sure what path this discussion should take, i.e. whether an RFC needs to follow or not. This post is to present the idea and get some initial feedback.
Problem
LLVM maintains global state, and that global state can have an impact on the behavior of LLVM functions.
A specific example of this is the set of flags that a clang user can pass to LLVM via the -mllvm option. For example, -mllvm -unroll-threshold=100 would set the threshold for loop unrolling to 100. Once that’s set, however, it remains in place, even when generating code for a different target. Since TVM can generate code for multiple targets within the same compilation, this can become an issue. [Note: -mllvm is the clang option; what follows it is the LLVM option. Naturally, there are ways to apply these LLVM options without clang.]
People working on individual targets may use -mllvm flags to fine-tune the LLVM codegen to their needs, or as workarounds for LLVM bugs, and such flags are meant to apply only to that target. However, once set, these options remain in effect “forever”. Moreover, some of them can only be specified once, leading to an error (abort) in LLVM when they are applied a second time.
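To make the two failure modes concrete, here is a toy model of a process-wide option table. It is not LLVM's implementation: GlobalOptions, Options, UnrollThresholdFor and the fallback value 50 are all hypothetical names and numbers chosen for illustration, and the model treats every option as "specify at most once".

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Toy stand-in for process-wide option state. NOT LLVM's actual code.
class GlobalOptions {
 public:
  // Record an option. A second occurrence is treated as an error,
  // mimicking options that may only be specified once.
  void Set(const std::string& name, int value) {
    if (!values_.emplace(name, value).second) {
      throw std::runtime_error("option '" + name + "' specified more than once");
    }
  }

  // Return the stored value, or the fallback if the option was never set.
  int Get(const std::string& name, int fallback) const {
    auto it = values_.find(name);
    return it == values_.end() ? fallback : it->second;
  }

 private:
  std::map<std::string, int> values_;
};

// One instance for the whole process, like a global variable.
GlobalOptions& Options() {
  static GlobalOptions opts;
  return opts;
}

// Hypothetical per-target codegen step. Only targets that ask for the
// flag set it, but every target reads the same global table.
int UnrollThresholdFor(const std::string& target, bool pass_flag) {
  (void)target;  // the target name plays no role: the state is global
  if (pass_flag) Options().Set("unroll-threshold", 100);
  return Options().Get("unroll-threshold", /*fallback=*/50);  // 50 is arbitrary
}
```

In this model, a first target that passes the flag gets 100, a second target that never asked for it also gets 100, and passing the flag again for the first target throws.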
Solution
To solve this, we need a mechanism to “reset” the state of global variables in LLVM back to their original state. The only mechanism that allows that (that I am aware of) is loading/unloading shared libraries. I propose to isolate the LLVM code generation into its own shared library. This library would be loaded (dlopen) when object code needs to be generated, and unloaded (dlclose) afterwards.
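The load/generate/unload cycle could look roughly like the sketch below, written against the POSIX dlopen/dlsym/dlclose API. The entry point name tvm_llvm_codegen, its signature, and the ScopedLibrary and GenerateObject helpers are assumptions made up for this illustration; nothing like them exists in TVM today.

```cpp
#include <dlfcn.h>  // POSIX: dlopen, dlsym, dlclose, dlerror

#include <stdexcept>
#include <string>

// RAII wrapper: the library is loaded in the constructor and released
// in the destructor.
class ScopedLibrary {
 public:
  explicit ScopedLibrary(const std::string& path)
      : handle_(dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL)) {
    if (handle_ == nullptr) {
      const char* err = dlerror();
      throw std::runtime_error(err ? err : "dlopen failed");
    }
  }
  ~ScopedLibrary() { dlclose(handle_); }

  ScopedLibrary(const ScopedLibrary&) = delete;
  ScopedLibrary& operator=(const ScopedLibrary&) = delete;

  // Look up an exported symbol, throwing if it is missing.
  void* Symbol(const std::string& name) const {
    dlerror();  // clear any stale error state
    void* sym = dlsym(handle_, name.c_str());
    if (const char* err = dlerror()) throw std::runtime_error(err);
    return sym;
  }

 private:
  void* handle_;
};

// Hypothetical C entry point exported by the codegen library:
// reads a module, emits an object file, returns 0 on success.
using CodegenFn = int (*)(const char* module_path, const char* target,
                          const char* object_path);

// Load the codegen library, generate one object file, then unload it.
int GenerateObject(const std::string& lib_path, const std::string& module_path,
                   const std::string& target, const std::string& object_path) {
  ScopedLibrary lib(lib_path);
  auto codegen = reinterpret_cast<CodegenFn>(lib.Symbol("tvm_llvm_codegen"));
  return codegen(module_path.c_str(), target.c_str(), object_path.c_str());
}  // lib goes out of scope here, which calls dlclose
```

One caveat: dlclose only drops a reference. The globals are reset only if the library is actually unmapped and later reloaded, which requires that nothing else in the process keeps it loaded, and some C libraries never unload at all.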
The JIT functionality would be accomplished by separating the codegen step from the execution step: the codegen library would generate an object file, which would then be loaded via a dynamic loader mechanism. This is actually what already happens anyway, except that it happens inside the ExecutionEngine; in the proposal the two steps would be separated.
What are everyone’s thoughts about this?