Does the VM support the 'cuda' target now? Compilation succeeds, but I get an error at runtime

I am currently trying to port a model that uses MoE to TVM for optimization. MoE involves some dynamic tensors. My target is cuda.

I successfully ran the model through Ansor and MetaSchedule for optimization, and successfully compiled it using the VM.

However, when I call '.run' after setting the input, I get an error:

Check failed: ret == 0 (-1 vs. 0) : TVMError: CUDALaunch Error: CUDA_ERROR_INVALID_VALUE grid=(1,100352,1), block=(1024,1,1) // func_name=vm_mod_fused_argsort_kernel0

The tutorial I am following is: how_to/deploy_models/deploy_object_detection_pytorch

I noticed that it has a note saying: “Currently only CPU target is supported.”

I’m a little confused about this. I think TVM’s VM is based on Nimble, and Nimble seems to support cuda. In addition, in another discussion, haichen also mentioned that “we can use VM runtime on GPU without performance issue”.

So, I’m not sure whether the error I get is because my model uses an unsupported operator, or because the VM does not support the cuda target.

@haichen

Looking forward to your answer :pleading_face:

Thank you very much.

Sorry, I found the problem.

The VM can be compiled for the cuda target.
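For reference, here is a minimal sketch of that flow. It mirrors the VM steps from the tutorial with the target and device swapped to cuda, and it assumes a TVM build with CUDA enabled and the Relay VM API. The function name `run_on_cuda_vm` and the arguments `mod`, `params`, `input_name`, and `data` are placeholders for illustration, not names from my model.

```python
def run_on_cuda_vm(mod, params, input_name, data):
    """Compile a Relay module with the VM for cuda and run it once.

    Assumes `mod` and `params` come from a frontend such as
    relay.frontend.from_pytorch, and `data` is a NumPy array.
    """
    # Imported lazily so the sketch can be loaded without TVM installed.
    import tvm
    from tvm import relay
    from tvm.runtime.vm import VirtualMachine

    # Compile for the VM executor instead of the graph executor.
    with tvm.transform.PassContext(opt_level=3):
        vm_exec = relay.vm.compile(mod, target="cuda", params=params)

    # Run on the first CUDA device.
    dev = tvm.cuda(0)
    vm = VirtualMachine(vm_exec, dev)
    vm.set_input("main", **{input_name: data})
    return vm.run()
```

If tuning logs from Ansor or MetaSchedule are used, they would need to be applied around the compile step; that part is left out here.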

I got this error because, when manipulating the tensor in PyTorch, I flattened dims 0 and 1 of the input. At runtime, TVM launches threads according to the size of dim 0. That value used to be the batch size, but after flattening it becomes very large, which leads to the error.
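One plausible reading of the log, consistent with the explanation above: the launch used grid=(1,100352,1), and CUDA limits the y and z grid dimensions to 65535, so a grid.y of 100352 is rejected with CUDA_ERROR_INVALID_VALUE. The small check below is a sketch, not part of TVM. The helper name `launch_violations` is made up for illustration, and the limits are the ones the CUDA programming guide lists for compute capability 3.0 and newer.

```python
# Per-dimension launch limits (compute capability 3.0 and newer).
MAX_GRID = (2**31 - 1, 65535, 65535)
MAX_BLOCK = (1024, 1024, 64)
MAX_THREADS_PER_BLOCK = 1024


def launch_violations(grid, block):
    """Return a list of human-readable limit violations for a launch config."""
    problems = []
    for axis, value, limit in zip("xyz", grid, MAX_GRID):
        if not 1 <= value <= limit:
            problems.append(f"grid.{axis}={value} is outside [1, {limit}]")
    for axis, value, limit in zip("xyz", block, MAX_BLOCK):
        if not 1 <= value <= limit:
            problems.append(f"block.{axis}={value} is outside [1, {limit}]")
    threads = block[0] * block[1] * block[2]
    if threads > MAX_THREADS_PER_BLOCK:
        problems.append(
            f"threads per block={threads} exceeds {MAX_THREADS_PER_BLOCK}"
        )
    return problems


# Launch configuration copied from the error message.
violations = launch_violations((1, 100352, 1), (1024, 1, 1))
print(violations)
```

With the configuration from the error message, only the grid.y entry is reported; the block size of 1024 threads is within the limit.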

Besides, the tutorial I followed might need to be updated.