Will TVM support multi-GPU inference for very large models like GPT-3 or ChatGPT in the future?
I also wonder how TVM uses multiple GPUs during tuning. Does it just dispatch different parameter settings to different GPUs and measure the performance of each?
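To make the question concrete, here is a minimal sketch of the dispatch scheme I have in mind. This is purely illustrative and not TVM's actual tuning code; the function names, the fake latency formula, and the round-robin GPU assignment are all my own assumptions (a real tuner would build and time each candidate kernel on the device, e.g. via RPC workers, rather than use threads and a synthetic cost):

```python
# Hypothetical sketch (NOT TVM's implementation): dispatch candidate
# parameter settings ("configs") across GPUs, measure each, keep the best.
from concurrent.futures import ThreadPoolExecutor

def measure_on_gpu(gpu_id, config):
    # A real tuner would compile the kernel with `config`, run it on
    # device `gpu_id`, and time it. Here we return a deterministic fake
    # latency derived from the config, purely for illustration.
    tile_x, tile_y = config
    fake_latency_ms = 100.0 / (tile_x * tile_y)
    return config, fake_latency_ms

def tune(configs, num_gpus):
    # Round-robin assign each candidate config to a GPU and measure
    # the candidates in parallel, one worker per GPU.
    tasks = [(i % num_gpus, cfg) for i, cfg in enumerate(configs)]
    with ThreadPoolExecutor(max_workers=num_gpus) as pool:
        results = list(pool.map(lambda t: measure_on_gpu(*t), tasks))
    # Return the config with the lowest measured latency.
    return min(results, key=lambda r: r[1])

if __name__ == "__main__":
    candidates = [(1, 1), (2, 4), (8, 8), (4, 2)]
    best_cfg, best_ms = tune(candidates, num_gpus=2)
    print(best_cfg, best_ms)
```

Is this roughly what TVM does internally, or is the parallelism organized differently (e.g. through an RPC tracker that multiple devices register with)?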
Also interested in this topic.