Thank you, I will try the C interface. It’s true that manually calling a CUDA kernel can’t handle multiple CUDA kernels.