Duplication of the driver between C++ and Python

mbaret · April 14, 2021, 10:54am

I’ve been looking into the TVM lower/build pipeline recently and have encountered an unusual duplication around the ‘driver’. In particular, we have two files src/driver/driver_api.cc and python/tvm/driver/build_module.py which both seem to independently define almost identical functionality. What has motivated this design rather than using the ffi to just expose the C++ functionality through Python? I’m interested in making some changes in this part of the code so it would be good to know whether I have to make identical changes in both files.

Additionally, I’m seeing a strange ‘mix-and-match’ between these two files when invoking relay.build. In particular, we seem to use the Python version of lower but the C++ version of build. I can find no obvious motivation for this behaviour in the code.

Any insights would be greatly appreciated!

@manupa-arm @tqchen

manupa-arm · April 14, 2021, 11:13am

Yes, this part has been a pain point when trying to figure out which part of the compilation pipeline is actually being run.

Regarding lower, I think the C++ version is not run at the moment (maybe not anywhere in the TVM compilation flow, correct me if I am wrong), because there is a check for a Python function of lower(…) registered through the FFI.
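To illustrate the kind of check described above, here is a minimal, self-contained sketch of a "prefer the registered override, otherwise fall back" dispatch. It does not use TVM at all: the registry dict, the key "example.lower", and every function name are hypothetical stand-ins, so treat it as a model of the pattern rather than the actual TVM code.

```python
# Toy stand-in for a global function registry shared across an FFI boundary.
_REGISTRY = {}


def register_func(name):
    """Register a Python callable under a global name (hypothetical helper)."""
    def decorator(fn):
        _REGISTRY[name] = fn
        return fn
    return decorator


def native_lower(schedule):
    """Stands in for the native (C++) implementation of lower."""
    return f"native-lowered({schedule})"


def lower_entry_point(schedule):
    """Use a registered override if one exists, else the native path."""
    override = _REGISTRY.get("example.lower")  # hypothetical registry key
    if override is not None:
        return override(schedule)
    return native_lower(schedule)


# Nothing registered yet, so the native path runs.
before = lower_entry_point("s")


@register_func("example.lower")
def python_lower(schedule):
    """Stands in for the Python implementation of lower."""
    return f"python-lowered({schedule})"


# Once the override is registered, the native path is never reached.
after = lower_entry_point("s")
```

Under this pattern the native implementation becomes dead code as soon as the Python side registers its function at import time, which would match the behaviour described above.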

Moreover, I found that some Relay passes (e.g. foldconstants) rely on the interpreter (which runs the JIT() of the compile engine) to build and execute some of the artifacts during the Relay optimization pipeline. What is interesting is that the compile engine’s JIT() uses the build from driver/build_module.py, even though relay.build(…) will subsequently invoke the C++ version of build in driver_api.cc.

Is there a reason for both to exist? Moreover, does the design assume that these two build(IRModule, TE) → runtime.Module implementations will be kept implicitly identical?

Also cc: @csullivan @jroesch

jroesch · April 17, 2021, 5:13am

I also found this a few months ago when I was working on the compile engine; my intention was to consolidate all of this code after I land the initial compile engine refactor. There is quite a lot of duplicated code around constructing the passes and running the compiler flow. Last week I tried to consolidate everything onto the C++ code path in the compile engine, but had to restore nearly all of the code because it was causing CI failures. I will try to land the compile engine refactor in the next couple of days (I am still debugging test case failures). After that, my plan is to first pull the passes into a single location, then rewrite both APIs to use them, and finally delete the extraneous APIs, leaving just the C++ ones.
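As a rough sketch of the "pull the passes into a single location" step, the idea is that both entry points build their pipeline from one shared definition, so a change to the pass list only has to be made once. Everything here is hypothetical (the pass names, the string-based "module", and both entry-point functions); it only models the structure, not the real TVM passes or APIs.

```python
from typing import Callable, List

# A "pass" here is just a function from module to module; a string stands in
# for the real IR module.
Pass = Callable[[str], str]


def shared_lowering_passes() -> List[Pass]:
    """Single source of truth for the pipeline (pass names are made up)."""
    return [
        lambda mod: mod + "|simplify",
        lambda mod: mod + "|vectorize",
    ]


def run_passes(mod: str, passes: List[Pass]) -> str:
    """Apply each pass in order and return the transformed module."""
    for p in passes:
        mod = p(mod)
    return mod


def lower_from_python_api(mod: str) -> str:
    """Hypothetical Python-facing entry point."""
    return run_passes(mod, shared_lowering_passes())


def lower_from_native_api(mod: str) -> str:
    """Hypothetical native-facing entry point using the same pass list."""
    return run_passes(mod, shared_lowering_passes())
```

With the pass list defined once, the two entry points cannot drift apart, and deleting one of them later does not change the pipeline itself.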

To answer your other question, I think this is just a historical accident: at some point the APIs were layered, and I think Tianqi tried to pull most of the functionality into C++, but we didn’t make the full cutover. I think we should just clean up all this code so there is only one path through the compiler for each executor.