Thanks for the clarification. So the codegen has to generate C++ code and compile it into a shared library, while the runtime needs to construct a model graph (or engine). That seems clear to me, and we can discuss the implementation details of when and where to invoke clang++ in the PR.
For testing the partition, we definitely cannot test profiling-based partitioning directly in the CI. Instead, it is reasonable to mock the profiling APIs with hard-coded latencies, so that no RPC is involved in testing.
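To make the mocking idea concrete, here is a minimal sketch in Python using `unittest.mock`. The `Profiler` class, its `measure` method, the `partition` function, and the latency numbers are all hypothetical placeholders, not the actual APIs in the PR; the only point is that the partition logic can be exercised against a fixed latency table without any RPC.

```python
from unittest import mock


class Profiler:
    """Hypothetical stand-in for a profiling API that would normally go through RPC."""

    def measure(self, op_name, target):
        raise RuntimeError("RPC is not available in CI")


def partition(ops, profiler, targets=("cpu", "accel")):
    """Hypothetical partitioner: put each op on the target with the lowest latency."""
    return {op: min(targets, key=lambda t: profiler.measure(op, t)) for op in ops}


# Hard-coded latencies (ms) that replace real measurements; the values are made up.
FAKE_LATENCY_MS = {
    ("conv2d", "cpu"): 5.0,
    ("conv2d", "accel"): 1.0,
    ("softmax", "cpu"): 0.2,
    ("softmax", "accel"): 0.9,
}


def fake_measure(op_name, target):
    return FAKE_LATENCY_MS[(op_name, target)]


def test_partition_with_mocked_profiler():
    profiler = Profiler()
    # Replace the RPC-backed method with a lookup into the fixed table.
    with mock.patch.object(profiler, "measure", side_effect=fake_measure):
        result = partition(["conv2d", "softmax"], profiler)
    assert result == {"conv2d": "accel", "softmax": "cpu"}
```

Because the latencies are fixed, the expected partition is deterministic and the test does not depend on any device being reachable.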
For testing the converter, I guess it is fine as long as the reference C++ code is short enough to be hard-coded in the unit test. Accordingly, I suggest the following unit tests:
- When testing small graphs with 1-2 ops, we compare the generated C++ code against a hard-coded C++ code string. Once the code matches, we try to compile it to see if there are any issues.
- When testing large graphs (e.g., the whole network), we only try to compile the generated C++ code, without text-based checking.
My point is, if the expected C++ code is hard-coded in the unit test, anyone can dive in and fix the test once it fails. In this way, we keep the tests maintainable.
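As a rough illustration of the small-graph test, something like the following could work. `generate_cpp` is a hypothetical stand-in for the real converter, and the expected string is invented for this example; the compile step assumes `clang++` (or `g++` as a fallback) is on the PATH and is simply skipped when neither is found.

```python
import os
import shutil
import subprocess
import tempfile

# Hard-coded reference code for a one-op graph (made up for this sketch).
EXPECTED_ADD_CPP = (
    'extern "C" void add(const float* a, const float* b, float* out, int n) {\n'
    "  for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];\n"
    "}\n"
)


def generate_cpp(name, op):
    """Hypothetical converter: emit C++ for a single elementwise binary op."""
    return (
        f'extern "C" void {name}(const float* a, const float* b, float* out, int n) {{\n'
        f"  for (int i = 0; i < n; ++i) out[i] = a[i] {op} b[i];\n"
        "}\n"
    )


def compile_to_shared_lib(code):
    """Compile code into a shared library; return None if no compiler is available."""
    cxx = shutil.which("clang++") or shutil.which("g++")
    if cxx is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "gen.cc")
        lib = os.path.join(tmp, "gen.so")
        with open(src, "w") as f:
            f.write(code)
        proc = subprocess.run(
            [cxx, "-shared", "-fPIC", src, "-o", lib],
            capture_output=True,
            text=True,
        )
        return proc.returncode == 0


def test_small_graph():
    code = generate_cpp("add", "+")
    # Step 1: text-based check against the hard-coded reference.
    assert code == EXPECTED_ADD_CPP
    # Step 2: make sure the generated code actually compiles (skipped without a compiler).
    compiled = compile_to_shared_lib(code)
    assert compiled is None or compiled is True
```

For the whole-network test, only the second step would apply, since a hard-coded reference string for a full network would be too long to maintain.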