Does TVM support multiple streams on GPU?

Hi, when my target is set to CUDA, I found that the whole computational graph runs on a single stream. Can TVM generate code that uses multiple streams?

The example I used is the ResNet-50 v2 model from the documentation.

Does anyone know the answer? Thanks.

Hi @cheng, you might want to try using CUDA Graph directly. See tvm/tests/python/runtime/test_runtime_module_based_interface.py at main · apache/tvm (github.com)

Thank you for your answer, @LeiWang1999.

As I understand it, CUDA Graph capture is not intended for parallelism between ops. If I want independent ops to run in parallel, does TVM provide any way to do that?

I found that some ops in ResNet-50 can be parallelized, but the final generated code is serial. I don't understand why it is done this way.
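
To make concrete what I mean by ops that "can be parallelized", here is a minimal sketch in plain Python. It does not use any TVM API; the `parallel_levels` helper and the simplified `block` graph (a ResNet-style block with a projection shortcut) are hypothetical and only for illustration. It groups ops by their longest dependency chain, so ops that land in the same level do not depend on each other and could in principle be issued on different streams.

```python
from collections import defaultdict


def parallel_levels(deps):
    """Group ops into levels by longest dependency chain.

    deps maps each op name to the list of ops whose outputs it consumes.
    Assumes every op appears as a key and the graph has no cycles.
    Ops in the same level have no dependency on each other.
    """
    level = {}

    def depth(op):
        # Ops with no inputs get level 0; otherwise one more than the deepest input.
        if op not in level:
            level[op] = 1 + max((depth(d) for d in deps.get(op, [])), default=-1)
        return level[op]

    groups = defaultdict(list)
    for op in deps:
        groups[depth(op)].append(op)
    return [groups[i] for i in sorted(groups)]


# Hypothetical, simplified residual block (names are illustrative only).
block = {
    "input": [],
    "conv1": ["input"],
    "conv_shortcut": ["input"],
    "conv2": ["conv1"],
    "add": ["conv2", "conv_shortcut"],
}

levels = parallel_levels(block)
for i, ops in enumerate(levels):
    print(i, ops)
```

In this sketch `conv1` and `conv_shortcut` end up in the same level, which is the kind of pair I would expect to be able to run concurrently. Actually mapping such levels onto CUDA streams would still need synchronization between levels (for example with events), which is not shown here.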