```cpp
#if TRT_VERSION_GE(6, 0, 1)
  if (use_implicit_batch_) {
    ICHECK(context->execute(batch_size, bindings.data())) << "Running TensorRT failed.";
  } else {
    ICHECK(context->executeV2(bindings.data())) << "Running TensorRT failed.";
  }
#else
  ICHECK(context->execute(batch_size, bindings.data())) << "Running TensorRT failed.";
#endif
```
TVM-TRT uses the synchronous interfaces (`execute` / `executeV2`) to run inference. The async version, `enqueueV2`, is supposed to be faster. I guess there is a gap preventing TVM-TRT from using the async version. Can anybody give a hint on why the async interface isn't used?
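For context, here is a rough sketch (not TVM's actual code) of what the async path might look like. It assumes a dedicated CUDA stream and an explicit synchronization point before the outputs are consumed, which is one obvious source of complexity: the caller must decide when and where to sync.

```cpp
// Hypothetical sketch only, assuming a per-module CUDA stream.
// enqueueV2(void** bindings, cudaStream_t stream, cudaEvent_t* inputConsumed)
// queues the inference on the stream and returns immediately.
cudaStream_t stream;
cudaStreamCreate(&stream);

ICHECK(context->enqueueV2(bindings.data(), stream, nullptr))
    << "Running TensorRT failed.";

// Outputs are only valid after the stream has been synchronized
// (or an event recorded on it has completed).
cudaStreamSynchronize(stream);
cudaStreamDestroy(stream);
```

With the sync right after the enqueue, this degenerates into the same behavior as `executeV2`; the speedup only materializes if the runtime can overlap the enqueued work with other host-side work or other streams, which may be part of the gap being asked about.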
Friendly ping @trevor-m