CodeGenCHost and CodeGenCBase and relation to internal and external compilers?

Hello all,

while trying to study the DNNL external compiler, I came across differences in how the external (DNNL) and the internal (target='c') C code is generated:

Q1: Why do there seem to be two different “base” C codegen classes?

Q2: More generally, what exactly does CodeGenCHost imply that prevents CodegenDNNL from simply deriving from it? In other words, when is it wrong to derive an external compiler from CodeGenCHost?

  • Notice that some of the includes in CodegenDNNL are also included in CodeGenCHost.

Q3: Why is CSourceModuleCreate given only 2 arguments in CodeGenCHost?

Thanks

@comaniac @lhutton1 - maybe you could help?

For Q1, sorry for the confusion. CodegenCBase does not relate to CodeGenC at all. It is the base class we created for BYOC.

For Q2, CodegenDNNL does not have to worry about the C host as it only processes subgraphs.

For Q3, CSourceModuleCreate takes 4 arguments, of which the last two are optional.
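To make the Q3 answer concrete, here is a minimal Python sketch of a creator with two required and two optional arguments. It is only a toy stand-in, not the TVM function: `c_source_module_create` is a hypothetical name, the dict it returns is a placeholder for a module, and the name `const_vars` for the fourth parameter is an assumption (the thread only names the third one, `symbol`).

```python
def c_source_module_create(code, fmt, symbol="", const_vars=()):
    """Toy stand-in (hypothetical): two required and two optional arguments."""
    return {
        "code": code,
        "fmt": fmt,
        "symbol": symbol,                # defaults to the empty string
        "const_vars": list(const_vars),  # defaults to an empty list
    }


# Host-style call: only the two required arguments are passed.
host_module = c_source_module_create("void fused_add(void) {}", "c")

# BYOC-style call: the optional symbol (and constants) are passed too.
byoc_module = c_source_module_create(
    "void dnnl_0(void) {}", "c", "dnnl_0", ["dnnl_0_const_0"]
)
```

With only two arguments the symbol silently falls back to `''`, which is the situation the question is about.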

@zhiics may comment more on the details.

Thanks for the comments

I looked further into the code and this is what I came up with:

CodeGenC (which is the base class of CodeGenCHost) also deals with subgraphs. The main difference (AFAIK) between the two kinds of subgraphs is that for DNNL they are in Relay, while the others are in TIR. The latter have gone through the Relay -> TOPI -> TE -> TIR process, which is the standard lowering process; BYOC is an alternative to it. Therefore a BYOC C generator should not be derived from CodeGenC, since its inputs are not in TIR format (CodeGenC's routines for visiting nodes expect TIR nodes).
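The point about visitors can be illustrated with a small toy model in Python. Nothing here is TVM code: `TirAdd`, `RelayCall` and `ToyTirCodegen` are hypothetical names, and the dispatch-by-class-name scheme is just an assumption chosen to keep the sketch short. It shows a code generator whose visit routines only know one IR and therefore cannot handle nodes from another.

```python
from dataclasses import dataclass


@dataclass
class TirAdd:
    """Hypothetical TIR-like node: an addition of two named values."""
    a: str
    b: str


@dataclass
class RelayCall:
    """Hypothetical Relay-like node: a call to a named operator."""
    op: str
    args: tuple


class ToyTirCodegen:
    """Toy generator that only has visit routines for TIR-like nodes."""

    def visit(self, node):
        # Dispatch on the node's class name, e.g. TirAdd -> visit_TirAdd.
        handler = getattr(self, "visit_" + type(node).__name__, None)
        if handler is None:
            raise TypeError("no visitor for " + type(node).__name__)
        return handler(node)

    def visit_TirAdd(self, node):
        return f"({node.a} + {node.b})"


codegen = ToyTirCodegen()
tir_source = codegen.visit(TirAdd("x", "y"))

try:
    codegen.visit(RelayCall("nn.conv2d", ("data", "weight")))
    relay_supported = True
except TypeError:
    relay_supported = False
```

A generator for Relay subgraphs would need its own set of visit routines, which is why deriving it from the TIR-oriented class would not buy much.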

Yeah, I was aware that the last two are optional, but I was wondering why CodeGenCHost doesn't set them. I have a hypothesis, but I am unsure how correct it is, since it only concerns the third argument, symbol:

When generating C code with CodeGenCHost, the process collects all operators beforehand and then bundles them up in one call to CodeGenCHost. This means that the runtime::Module can have the '' symbol, since it is assumed that there will not be a second module with that name. In BYOC, and therefore in the DNNL example, each subgraph is sent to CodegenDNNL separately, generating n runtime::Modules. If all are given the same symbol, the compiler will throw an error about a name collision.
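The hypothesis above can be modelled with a toy registry in Python. This is only a sketch of the reasoning, not of how TVM actually stores or links modules: `ToyModuleRegistry` and the `dnnl_0`, `dnnl_1`, ... symbol names are hypothetical.

```python
class ToyModuleRegistry:
    """Toy model (hypothetical): one generated module per unique symbol."""

    def __init__(self):
        self._modules = {}

    def register(self, symbol, code):
        # A second module with an already used symbol is a collision.
        if symbol in self._modules:
            raise ValueError(f"duplicate symbol {symbol!r}")
        self._modules[symbol] = code

    def symbols(self):
        return sorted(self._modules)


# Host-style: everything is bundled into one module, so '' is used once.
host = ToyModuleRegistry()
host.register("", "/* all lowered operators */")

# BYOC-style: one module per subgraph, each with its own symbol.
byoc = ToyModuleRegistry()
for i in range(3):
    byoc.register(f"dnnl_{i}", f"/* subgraph {i} */")

# Giving every subgraph the same symbol fails on the second registration.
clashing = ToyModuleRegistry()
clashing.register("", "/* subgraph 0 */")
try:
    clashing.register("", "/* subgraph 1 */")
    collided = False
except ValueError:
    collided = True
```

In this model a single bundled module can get away with any symbol, including the empty one, while n separately generated modules need n distinct symbols.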

What my previous insight does not answer is:

  • Why, in the CodeGenCHost example, is symbol=''? How does the runtime know that the module where all routines are found is called '' and not any other string?
    • In the BYOC example it is given the name of the Relay composite function that was partitioned from the original Relay program. That makes sense because the Relay program calls that function, so having the same symbol makes it a simple mapping.
    • Based on that logic, I would expect it to have the default name 'main'.