NVIDIA Triton Inference Server (TIS) is a high-performance inference server developed by NVIDIA. It would be great if TVM could be a backend in TIS, like ONNX Runtime.
TIS can help TVM with model management, scheduling, and so on.
Are there any existing plans for this?