CPU-to-GPU data transfer is very slow on OpenCL with Mali

When I feed data via module.set_input, throughput drops to about 4 fps; without it, the loop reaches over 100 fps!

Example

import tvm
from tvm.contrib import graph_executor

target = tvm.target.mali(model="rk3588")
loaded_lib = tvm.runtime.load_module("lpr_model_0910_autotvm.tar")
dev = tvm.device(str(target), 0)
module = graph_executor.GraphModule(loaded_lib["default"](dev))

This gives over 100 fps:

import time

tic = time.time()
for i in range(100):
    module.run()
fps = 100 / (time.time() - tic)
print(f"FPS: {fps:.2f}")

Adding set_input slows it down to 4.67 fps!

import time

tic = time.time()
for i in range(100):
    module.set_input(input_name, tvm.nd.array(img_data))
    module.run()
fps = 100 / (time.time() - tic)
print(f"FPS: {fps:.2f}")
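
To see where the time actually goes, the two calls can be timed separately. Below is a minimal sketch, not a confirmed diagnosis: it assumes that OpenCL work may be queued asynchronously, so a loop containing only module.run() might be measuring enqueue time rather than finished inference. The helper name time_stages is hypothetical, and the sleep-based stages are stand-ins so the snippet runs without a GPU. With TVM, the sync argument could be dev.sync, which blocks until the device has finished its queued work.

```python
import time


def time_stages(stages, sync=None, repeats=100):
    """Return the average seconds per iteration for each named stage.

    stages: list of (name, callable) pairs, run in order on every iteration.
    sync:   optional callable that blocks until queued device work is done
            (assumption: with TVM this could be dev.sync). It is called after
            every stage so asynchronous work is charged to the stage that
            queued it.
    """
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        for name, fn in stages:
            start = time.perf_counter()
            fn()
            if sync is not None:
                sync()
            totals[name] += time.perf_counter() - start
    return {name: total / repeats for name, total in totals.items()}


# Stand-in stages so the sketch runs anywhere; replace them with the real
# calls, e.g. lambda: module.set_input(input_name, img_data) and module.run.
demo = time_stages(
    [("set_input", lambda: time.sleep(0.002)), ("run", lambda: time.sleep(0.001))],
    repeats=5,
)
for name, seconds in demo.items():
    print(f"{name}: {seconds * 1000:.2f} ms per iteration")
```

If the "run" stage alone turns out much slower once a sync is included, the 100 fps figure above would not reflect completed inference, and the cost attributed to set_input would be smaller than it looks.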

img_data is just a normalized image:

import cv2
import numpy as np

def load_image(image_path="./lp.jpg"):
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (94, 24))
    return image

img_data = load_image().astype(np.float32)
img_data /= 255.0
img_data = img_data.transpose((2, 0, 1))
img_data = img_data[np.newaxis, :]

How can I make it faster? Does feeding data to the Mali GPU really take this long?
