Hi,
I found that the output of a model deployed with the Vulkan backend is wrong. The models I am using are from https://github.com/tianweiy/CenterPoint, in ONNX format.
GPU:
Fri Oct 28 06:05:09 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02 Driver Version: 510.85.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 0% 43C P8 18W / 290W | 1431MiB / 8192MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1170 G /usr/lib/xorg/Xorg 102MiB |
| 0 N/A N/A 3728 G /usr/lib/xorg/Xorg 558MiB |
| 0 N/A N/A 3858 G /usr/bin/gnome-shell 84MiB |
| 0 N/A N/A 5043 G ...AAAAAAAAA= --shared-files 73MiB |
| 0 N/A N/A 5323 G ...181820525847910195,131072 494MiB |
| 0 N/A N/A 7175 G ...AAAAAAAAA= --shared-files 103MiB |
+-----------------------------------------------------------------------------+
Vulkan:
==========
VULKANINFO
==========
Vulkan Instance Version: 1.2.135
Instance Extensions: count = 18
===============================
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
Layers: count = 6
=================
VK_LAYER_KHRONOS_validation (Khronos Validation Layer) Vulkan version 1.2.135, layer version 1:
Layer Extensions: count = 3
VK_EXT_debug_report : extension revision 9
VK_EXT_debug_utils : extension revision 1
VK_EXT_validation_features : extension revision 2
Devices: count = 1
GPU id = 0 (NVIDIA GeForce RTX 3070 Ti)
Layer-Device Extensions: count = 3
VK_EXT_debug_marker : extension revision 4
VK_EXT_tooling_info : extension revision 1
VK_EXT_validation_cache : extension revision 1
VK_LAYER_LUNARG_api_dump (LunarG API dump layer) Vulkan version 1.2.135, layer version 2:
Layer Extensions: count = 0
Devices: count = 1
GPU id = 0 (NVIDIA GeForce RTX 3070 Ti)
Layer-Device Extensions: count = 1
VK_EXT_tooling_info : extension revision 1
VK_LAYER_LUNARG_device_simulation (LunarG device simulation layer) Vulkan version 1.2.135, layer version 1:
Layer Extensions: count = 0
Devices: count = 1
GPU id = 0 (NVIDIA GeForce RTX 3070 Ti)
Layer-Device Extensions: count = 1
VK_EXT_tooling_info : extension revision 1
VK_LAYER_LUNARG_monitor (Execution Monitoring Layer) Vulkan version 1.2.135, layer version 1:
Layer Extensions: count = 0
Devices: count = 1
GPU id = 0 (NVIDIA GeForce RTX 3070 Ti)
Layer-Device Extensions: count = 1
VK_EXT_tooling_info : extension revision 1
VK_LAYER_LUNARG_screenshot (LunarG image capture layer) Vulkan version 1.2.135, layer version 1:
Layer Extensions: count = 0
Devices: count = 1
GPU id = 0 (NVIDIA GeForce RTX 3070 Ti)
Layer-Device Extensions: count = 1
VK_EXT_tooling_info : extension revision 1
VK_LAYER_LUNARG_vktrace (Vktrace tracing library) Vulkan version 1.2.135, layer version 1:
Layer Extensions: count = 0
Devices: count = 1
GPU id = 0 (NVIDIA GeForce RTX 3070 Ti)
Layer-Device Extensions: count = 0
Presentable Surfaces:
=====================
Device Groups:
==============
Group 0:
Properties:
physicalDevices: count = 1
NVIDIA GeForce RTX 3070 Ti (ID: 0)
subsetAllocation = 0
Present Capabilities:
NVIDIA GeForce RTX 3070 Ti (ID: 0):
Can present images from the following devices: count = 1
NVIDIA GeForce RTX 3070 Ti (ID: 0)
Present modes: count = 1
DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
apiVersion = 4206786 (1.3.194)
driverVersion = 2140487808 (0x7f954080)
vendorID = 0x10de
deviceID = 0x2482
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = NVIDIA GeForce RTX 3070 Ti
The script I was using:
import onnx
import numpy as np
import onnxruntime as ort
import tvm.relay as relay
import tvm
from tvm.contrib import graph_executor
model_encoder = "/path_to_model/pts_voxel_encoder_centerpoint.onnx"
model_head = "/path_to_model/pts_backbone_neck_head_centerpoint.onnx"
onnx_encoder = onnx.load(model_encoder)
onnx_head = onnx.load(model_head)
x = np.ones((40000,32,9), dtype=np.float32)
# x = np.zeros((1,32,560,560), dtype=np.float32)
ort_sess = ort.InferenceSession(onnx_encoder.SerializeToString())
out_onnx = ort_sess.run(None, {'input_features': x})
target = "vulkan"
input_name = "input_features"
shape_dict = {input_name: x.shape}
mod, params = relay.frontend.from_onnx(onnx_encoder, shape_dict)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
dev = tvm.device(str(target), 0)
module = graph_executor.GraphModule(lib["default"](dev))
dtype = "float32"
module.set_input(input_name, x)
module.run()
output_shape = (40000, 1, 32)
tvm_output = module.get_output(0, tvm.nd.empty(output_shape)).numpy()
print(tvm_output.shape)
# for i in range(len(out_onnx)):
# idx = "output_" + str(0)
# Use the absolute difference so mismatches in either direction are counted.
result = np.abs(out_onnx[0] - tvm_output)
print(result[result > 0.0001])
print(np.count_nonzero(result > 0.0001))
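To make the mismatch easier to report, the comparison could be wrapped in a small helper along these lines. This is only a sketch: `summarize_mismatch` and its `atol` default are hypothetical names I made up for illustration, not part of TVM or ONNX Runtime, and it only needs NumPy.

```python
import numpy as np


def summarize_mismatch(reference, candidate, atol=1e-4):
    """Return simple mismatch statistics for two arrays of the same shape.

    Hypothetical helper for illustration; not a TVM or ONNX Runtime API.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")

    abs_diff = np.abs(reference - candidate)
    # A NaN difference compares False against atol, so flag NaNs explicitly.
    bad = (abs_diff > atol) | np.isnan(abs_diff)

    first_bad = tuple(int(i) for i in np.argwhere(bad)[0]) if bad.any() else None
    return {
        "num_mismatched": int(np.count_nonzero(bad)),
        "total": int(abs_diff.size),
        "max_abs_diff": float(np.nanmax(abs_diff)) if abs_diff.size else 0.0,
        "first_bad_index": first_bad,
    }
```

With the script above it would be called as `summarize_mismatch(out_onnx[0], tvm_output)`, which reports how many elements differ, the largest absolute difference, and where the first mismatch occurs.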
Does anyone know how to make this work, or is this a bug? Thank you.