[RFC][BYOC] Android NNAPI Integration

Thanks for the clarification. So the codegen has to generate C++ code and compile it into a shared library, while the runtime needs to construct a model graph (or engine). That seems clear to me, and we can discuss the implementation details of when and where to invoke clang++ in the PR.

For testing the partitioning, we definitely cannot test profiling-based partitioning directly in the CI. Instead, it is reasonable to mock the profiling APIs with hard-coded latencies, so that no RPC is involved in testing.
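As a rough illustration of the mocking idea, here is a minimal Python sketch. Everything in it is hypothetical: `RemoteProfiler`, `partition`, and the latency numbers are made up for the example and are not part of the proposed NNAPI integration. It only shows how `unittest.mock` can replace an RPC-backed measurement with a lookup table so the partitioning decision becomes deterministic.

```python
from unittest import mock


class RemoteProfiler:
    """Hypothetical profiler that would measure op latency on a device over RPC."""

    def measure(self, op_name, device):
        # In CI there is no device, so the real call cannot succeed.
        raise RuntimeError("RPC is not available in CI")


def partition(op_names, profiler):
    """Toy profiling-based partition: offload an op only when NNAPI is faster."""
    return {
        op: "nnapi"
        if profiler.measure(op, "nnapi") < profiler.measure(op, "cpu")
        else "cpu"
        for op in op_names
    }


# Hard-coded latencies in milliseconds (made-up values for the test only).
FAKE_LATENCIES_MS = {
    ("conv2d", "nnapi"): 1.0,
    ("conv2d", "cpu"): 4.0,
    ("softmax", "nnapi"): 3.0,
    ("softmax", "cpu"): 0.5,
}


def fake_measure(self, op_name, device):
    """Replacement for RemoteProfiler.measure that never touches RPC."""
    return FAKE_LATENCIES_MS[(op_name, device)]


def run_mocked_partition():
    # Patch the method on the class so every instance uses the fake latencies.
    with mock.patch.object(RemoteProfiler, "measure", fake_measure):
        return partition(["conv2d", "softmax"], RemoteProfiler())
```

With these fake latencies, `conv2d` should be assigned to `nnapi` and `softmax` to `cpu`, so a test can assert on the exact partition result.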

For testing the converter, I guess it is fine to hard code the reference C++ code in the unit test as long as it is not too long. Accordingly, I suggest the following unit tests:

  1. When testing a small graph with 1-2 ops, we compare the generated C++ code against a hard-coded C++ code string. Once the code matches, we try to compile it to see if there are any issues.
  2. When testing a large graph (e.g., the whole network), we only try to compile the generated C++ code without text-based checking.

My point is, if the expected C++ code is hard coded in the unit test, anyone can dive in and fix the test once it fails. This keeps the tests maintainable.
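To make the first kind of test concrete, here is a small Python sketch. `generate_cpp` is a hypothetical stand-in for the converter and emits a trivial function rather than real NNAPI calls, so it compiles on a host without the Android NDK. The compile step assumes `clang++` or `g++` is on the PATH and is skipped otherwise.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def generate_cpp(op_name):
    """Hypothetical converter stand-in: emits one trivial C++ function per op."""
    lines = [
        f'extern "C" int run_{op_name}(int x) {{',
        "  return x;",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Reference code hard coded in the test, so a failure is easy to inspect and fix.
EXPECTED_RELU = 'extern "C" int run_relu(int x) {\n  return x;\n}\n'


def compiles(source):
    """Syntax-only compile. Returns True or False, or None if no compiler is found."""
    compiler = shutil.which("clang++") or shutil.which("g++")
    if compiler is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "generated.cc"
        path.write_text(source)
        result = subprocess.run(
            [compiler, "-fsyntax-only", str(path)], capture_output=True
        )
        return result.returncode == 0


def test_small_graph():
    generated = generate_cpp("relu")
    # Step 1: text-based check against the hard-coded reference.
    assert generated == EXPECTED_RELU
    # Step 2: compile check, tolerated as skipped when no compiler is installed.
    assert compiles(generated) in (True, None)
```

For the whole-network case, the same `compiles` helper could be used on its own, without the string comparison.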