[pre-RFC] New CI container: ci_cpu_asserts

Hi all,

I wanted to discuss the idea of adding a new CI container: ci_cpu_asserts. This container would behave just like ci_cpu, except that both TVM and various libraries within the container would be compiled with additional runtime checks enabled. For example:

  • TVM could be configured with -DUSE_RELAY_DEBUG
  • LLVM could be configured with -DLLVM_ENABLE_ASSERTIONS
  • we could consider other runtime checkers such as AddressSanitizer

In CI, we would build TVM and run tests/scripts/task_python_unittest.sh against it. There are a few scenarios this would benefit:

  • PR 9842 proposes to add a concurrent-update check to tvm’s custom Map impl, but due to a limitation in LLVM, we cannot compile with NDEBUG defined. Therefore, we would like to compile with USE_RELAY_DEBUG, but that could slow down integration tests in the CI.
  • Our LLVM backends are not tested as well as they should be. LLVM assertions perform proper verification of the emitted IR. Both major changes I’ve made (link-params, AOT LLVM backend) have had problems only found with -DLLVM_ENABLE_ASSERTIONS (in some cases, these were minor; in others, they would cause functional correctness problems).
  • There is no home for this checking now, meaning that it’s controversial to add it since it slows down CI.

Perhaps we can discuss this at the TVM Community Meeting on Wednesday.

cc @kparzysz @dmitriy-arm @driazati

Andrew

Full support from me! This shouldn’t affect CI runtime too much: with build caching and test sharding we can get the runtime down to something similar to existing stages (so the extra runtime will be hidden via parallelism). Also, I don’t know if it’s implied by USE_RELAY_DEBUG, but we should also build it with -DCMAKE_BUILD_TYPE=Debug, since we’ve had some breakages of the debug build in the past that slipped through CI. I’d also propose we build with the clang associated with whatever LLVM version we use, just to increase the number of CI-tested compilers.
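
For illustration, test sharding can be sketched as a stable hash of each test ID modulo the number of shards, so every CI worker picks a disjoint subset without coordinating with the others. This is a minimal sketch and not necessarily how TVM's CI splits tests; the function names and the test IDs in the usage note are hypothetical.

```python
import zlib


def shard_for(test_id: str, num_shards: int) -> int:
    """Map a test ID to a shard index in [0, num_shards).

    crc32 is used because it is stable across processes and machines,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(test_id.encode("utf-8")) % num_shards


def select_shard(test_ids, shard_index: int, num_shards: int):
    """Return only the tests that belong to the given shard, in input order."""
    return [t for t in test_ids if shard_for(t, num_shards) == shard_index]
```

A worker that knows its own index would call something like select_shard(all_tests, shard_index=1, num_shards=4) and run only the returned tests. Every test lands in exactly one shard, so the union of all shards is the full suite.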

We could have a daily build of the main branch of LLVM. We could build clang, not just the LLVM codegen libraries.

Let’s discuss this a bit more on Wednesday, and then I’ll wrap it up into an RFC and we can go forward from there.

Notes from the TVM community meeting, 13-Apr. (Agenda doc is Apache TVM Weekly Community Meeting - Google Docs)

Andrew: Motivation was to add small-but-non-trivial checking overhead to Map. How can we get the benefit without forcing onto all runtime? NDEBUG is difficult due to LLVM issues, so USE_RELAY_DEBUG. Currently never enabled in any CI build. Would also use to enable LLVM checker on constructed IR to catch ill-formed constructions (which is very easy to trigger). In the future could use asan.

Mark: Policy on ICHECK vs DCHECK? Issue is DCHECKs can go stale, so they depend on regular USE_RELAY_DEBUG runs. Let’s do that and pay down any existing debt, then we can migrate ICHECKs to DCHECKs where they are expensive.

Michael: Use LLVM debug build in USE_RELAY_DEBUG mode? Currently using standard Ubuntu build. Hoping to stay in standard x86 targets, so should be ok to build from LLVM HEAD.

Leandro: Used to have to build LLVM from source, so can help point to ‘standard’ flags. Should standardize on versions & config. Andrew: We tend to be very conservative on LLVM version.

Mark: Some tests or all tests? Ideally choose existing split point (eg unit vs python tests).

Mark: Could use this to enable post-Pass self-invariant checks.

Mark: Can we turn on stdlib DEBUG asserts? Not sure how, but seems like a good idea.

Overall very enthusiastic, thank you Andrew!

@kparzysz @areusch FWIW, we already have an LLVM builder script being used in tlcpack. It was originally written by @haichen.

We could try to reuse this in TVM, with a few changes to make it similar to the other scripts we have.

A lot of what that script does has to do with dealing with various versions of LLVM source distributions. For our purposes none of that is necessary. Building LLVM is really simple; it’s

  1. git clone (for building the latest main branch)
  2. cmake ...
  3. make all install

I think the main question here is what to pass in ... given to cmake. Any thoughts on which flags we may want for this build?

The minimum set of flags would be

-DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_PREFIX=/where/to/install
-DLLVM_ENABLE_ASSERTIONS:BOOL=ON
-DLLVM_ENABLE_PROJECTS='llvm'

Edit: This will build LLVM only, i.e. the headers and codegen libraries. If we want to build a compiler that can compile C/C++ code, we’d need to do something else (there are scripts to build an entire release).
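
The three steps and the minimum flag set above can be sketched as a small helper that assembles the commands without running them. This is only a sketch under assumptions: the repository URL is the upstream llvm-project monorepo, the source, build, and install paths are placeholders, the cmake -S/-B options require CMake 3.13 or newer, and the function name is hypothetical.

```python
import shlex

# Assumption: build from the upstream monorepo's main branch.
LLVM_REPO = "https://github.com/llvm/llvm-project.git"


def llvm_build_commands(src_dir, build_dir, install_prefix, assertions=True):
    """Return the clone, configure, and build+install steps as argv lists."""
    cmake_flags = [
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DCMAKE_INSTALL_PREFIX={install_prefix}",
        f"-DLLVM_ENABLE_ASSERTIONS:BOOL={'ON' if assertions else 'OFF'}",
        "-DLLVM_ENABLE_PROJECTS=llvm",
    ]
    return [
        # 1. git clone (latest main branch)
        ["git", "clone", LLVM_REPO, src_dir],
        # 2. cmake ... (LLVM's top-level CMakeLists.txt lives in the llvm/ subdirectory)
        ["cmake", "-S", f"{src_dir}/llvm", "-B", build_dir, *cmake_flags],
        # 3. make all install (assumes the default Makefile generator)
        ["make", "-C", build_dir, "all", "install"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in llvm_build_commands("llvm-project", "llvm-build", "/where/to/install"):
        print(shlex.join(argv))
```

Printing rather than executing keeps the flag list easy to review; passing each list to subprocess.run(argv, check=True) would turn the dry run into a real build.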

@areusch , @mbs-octoml

Thanks @cbalint13, that’s going into my config.cmake right now.