[BYOC] What is the main difference between te.extern and BYOC?

mgeek · November 7, 2022, 2:15am

Hi, I am currently learning how to use BYOC and have experimented with some cases, such as incorporating the Arm Compute Library with BYOC. Before this, I also tried some “call external function” cases with NNPACK via te.extern.

I know BYOC can partition the graph and offload sub-graph level components completely to third-party runtime libraries. My question is: for single-operator cases, what is the difference between using BYOC and te.extern?

Although we might lose the ability to merge several operators into a sub-graph composite, it seems to me that te.extern can also do the offloading (calling an external function) when it comes to single operators. It is also recognized as an operator implementation, so it can be used with AutoTVM.

Thanks in advance!

@comaniac @tqchen @junrushao @lhutton1

masahi · November 7, 2022, 5:50am

You are right, for single-op offloading, te.extern and BYOC are practically not that different. Of course, the compilation mechanisms are drastically different - te.extern based offloading would require more invasive changes (TOPI, op strategy, etc.).

mgeek · November 7, 2022, 7:24am

Thanks for the help~ This clarifies the whole picture a lot!

Btw, can you elaborate on what you mean by the invasive changes required by te.extern? From my understanding, we can’t change the inherent schedule of an external function called through te.extern either, so basically we just have to add it as a plain candidate implementation for a certain operator in the op strategy, right?

masahi · November 7, 2022, 8:40am

Yes, this is exactly right.

By invasive, I meant that the te.extern based approach needs to modify things across the stack, like topi and op strategy, while BYOC is more localized. See and compare how cuDNN (te.extern) and CUTLASS (BYOC) support are implemented.
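To make the contrast concrete, here is a toy sketch in plain Python. It does not use any TVM API: `Implementation`, `OpStrategy`, `partition`, the op names and the priority values are all hypothetical stand-ins. The first half models the te.extern route, where the library call is just one more candidate registered in an operator's strategy. The second half models the BYOC route, where supported operators are grouped into regions at the graph level and the per-operator code is left alone.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


# --- te.extern style: the library call is one more candidate in the op strategy ---

@dataclass
class Implementation:
    name: str
    compute: Callable[[List[float]], List[float]]
    plevel: int = 10  # priority level; the highest one wins in this toy model


@dataclass
class OpStrategy:
    implementations: List[Implementation] = field(default_factory=list)

    def add_implementation(self, impl: Implementation) -> None:
        self.implementations.append(impl)

    def select(self) -> Implementation:
        # A real compiler could also consult tuning records here (assumption).
        return max(self.implementations, key=lambda impl: impl.plevel)


def native_relu(xs: List[float]) -> List[float]:
    # Stand-in for a compiler-generated kernel.
    return [max(x, 0.0) for x in xs]


def library_relu(xs: List[float]) -> List[float]:
    # Stand-in for an opaque call into a third-party library.
    return [x if x > 0 else 0.0 for x in xs]


relu_strategy = OpStrategy()
relu_strategy.add_implementation(Implementation("relu.native", native_relu, plevel=10))
relu_strategy.add_implementation(Implementation("relu.extern_lib", library_relu, plevel=15))
chosen = relu_strategy.select()


# --- BYOC style: group supported ops into regions at the graph level ---

def partition(ops: List[str], supported: Set[str]) -> List[Tuple[str, List[str]]]:
    """Split a linear op sequence into consecutive 'external' and 'tvm' regions."""
    regions: List[Tuple[str, List[str]]] = []
    for op in ops:
        target = "external" if op in supported else "tvm"
        if regions and regions[-1][0] == target:
            regions[-1][1].append(op)  # extend the current region
        else:
            regions.append((target, [op]))  # start a new region
    return regions


regions = partition(["conv2d", "relu", "softmax", "conv2d"], {"conv2d", "relu"})
```

In this sketch the te.extern-like path touches the operator's own strategy (one new entry per operator and target), while the BYOC-like path only needs a list of supported ops plus a separate partitioning step. That is roughly the "across the stack" versus "localized" distinction.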

lhutton1 · November 7, 2022, 2:00pm

Just wanted to point to an additional resource, Target Hooks, which lets you customise the lowering pipeline for your integration: [pre-RFC] Additional Target Hooks. I believe this helps unlock the benefits of both mechanisms.