Does TVM support multiple streams on GPU?

cheng · August 28, 2024, 2:09pm

Hi, when my target is set to CUDA, I found that the computational graph runs on a single stream. Can TVM generate code that uses multiple streams?

The example I used is the ResNet-50 v2 model from the documentation.

Does anyone know the answer? Thanks.

LeiWang1999 · August 31, 2024, 5:14am

cheng · August 31, 2024, 7:53am

Thank you for your answer, @LeiWang1999.

I understand that CUDA’s capture is not intended for op-level parallelism. If I want to run independent ops in parallel, does TVM provide any way to do that?

I found that some ops in ResNet-50 could run in parallel, but the final generated code is serial. I don’t understand the reason for this.
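
To make "ops that can be parallelized" concrete, here is a minimal sketch in plain Python. It does not use any TVM API. The graph is a hypothetical residual block with a projection shortcut, and the op names and the `parallel_levels` helper are made up for illustration. It groups ops by their longest-path depth from the inputs; two ops at the same depth cannot depend on each other, so in principle they could be issued on different streams.

```python
from collections import defaultdict


def parallel_levels(deps):
    """Group ops into levels by longest-path depth from the graph inputs.

    deps maps each op name to the list of ops whose outputs it consumes.
    Ops that share a level have no dependency on one another.
    """
    depth = {}

    def visit(op):
        # Depth is 0 for ops without producers, else 1 + deepest producer.
        if op not in depth:
            depth[op] = 1 + max((visit(p) for p in deps[op]), default=-1)
        return depth[op]

    levels = defaultdict(list)
    for op in deps:
        levels[visit(op)].append(op)
    return [sorted(levels[d]) for d in sorted(levels)]


# Hypothetical residual block (illustrative names, not taken from a real model dump).
block = {
    "input": [],
    "conv1": ["input"],
    "shortcut_conv": ["input"],
    "conv2": ["conv1"],
    "add": ["conv2", "shortcut_conv"],
}

levels = parallel_levels(block)
for i, ops in enumerate(levels):
    print(i, ops)
```

In this toy graph `conv1` and `shortcut_conv` land in the same level, which is the kind of pair I would expect to be able to overlap. The grouping is conservative: `shortcut_conv` is also independent of `conv2`, but a level-based split does not show that.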