Many times, when I use the relax.build() API to deploy an AI model, an error occurs in tvm::codegen::CodeGenLLVM::Verify() after tvm::codegen::CodeGenLLVM::Finish().
There are generally two types of errors on the LLVM side:
a) Operands are not of the same type. E.g.:
Both operands to ICmp instruction are not of the same type!
  %109 = icmp eq i32 %1, i64 1
Both operands to ICmp instruction are not of the same type!
  %119 = icmp eq i32 %1, i64 1
b) A return instruction returns non-void in a function with void return type. E.g.:
Found return instr that returns non-void in Function of void return type!
  ret i32 %12
 void
Found return instr that returns non-void in Function of void return type!
  ret i32 %74
 void
Found return instr that returns non-void in Function of void return type!
  ret i32 %9
 void
To my limited knowledge, we could transform the PrimFunc into a legal form before lowering it to an llvm::Function. I have found some passes in tir/transform, such as tir.transform.NarrowDataType() and tir.transform.ForceNarrowIndexToInt32(), but they have not fully solved my difficulty.
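For reference, this is roughly how those passes can be applied before the LLVM backend runs; a minimal sketch, assuming mod is an IRModule that already contains lowered TIR PrimFuncs:

import tvm
from tvm import tir

# Sketch: run the index-narrowing passes before codegen.
# Assumes `mod` is an IRModule of lowered TIR PrimFuncs.
seq = tvm.transform.Sequential(
    [
        tir.transform.NarrowDataType(32),         # narrow index expressions to int32 where provable
        tir.transform.ForceNarrowIndexToInt32(),  # force remaining index dtypes down to int32
    ]
)
mod = seq(mod)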
I’m trying to reproduce the unit test on an NVIDIA A10 GPU; however, in the latest tvm/unity branch, the relax frontend doesn’t support the NonMaxSuppression op yet. I had forgotten that I added NonMaxSuppression to the relax frontend in our own code repo.
The following approach tries to load the NMS op by reusing the relay frontend and converting with relay_translator(); still, a similar error occurs in the BufferVar dtype.
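For concreteness, here is a minimal sketch of that relay-reuse path; "model.onnx" is a placeholder name for any ONNX model containing a NonMaxSuppression node:

import onnx
import tvm
from tvm import relay, relax
from tvm.relax.testing import relay_translator

# "model.onnx" is a placeholder; any ONNX model with a NonMaxSuppression node will do.
onnx_model = onnx.load("model.onnx")
relay_mod, params = relay.frontend.from_onnx(onnx_model)
# Translate the relay module into a relax IRModule, then build it.
relax_mod = relay_translator.from_relay(relay_mod["main"], target="llvm")
ex = relax.build(relax_mod, target="llvm")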
The NonMaxSuppression op is quite important in CV models. Due to my limited ability, there may be some bugs in the NMS OpConverter I added to the relax frontend. The following is my implementation:
class NonMaxSuppression(OnnxOpConverter):
    """Operator converter for NonMaxSuppression."""

    @classmethod
    def _impl_v10(cls, bb, inputs, attr, params):
        # Unpack the five ONNX NMS inputs.
        boxes = inputs[0]
        scores = inputs[1]
        max_output_boxes_per_class = inputs[2]
        iou_threshold = inputs[3]
        score_threshold = inputs[4]
        boxes_dtype = boxes.checked_type.dtype

        # center_point_box=1 means boxes are (xc, yc, w, h); convert them to
        # the (y1, x1, y2, x2) corner layout that the TOPI kernel expects.
        if attr.get("center_point_box", 0) != 0:
            xc, yc, w, h = bb.normalize(relax.op.split(boxes, 4, axis=2))
            half_w = w / relax.expr.const(2.0, boxes_dtype)
            half_h = h / relax.expr.const(2.0, boxes_dtype)
            x1 = xc - half_w
            x2 = xc + half_w
            y1 = yc - half_h
            y2 = yc + half_h
            boxes = bb.normalize(relax.op.concat([y1, x1, y2, x2], axis=2))

        # The optional thresholds default to 0.
        if iou_threshold is None:
            iou_threshold = relax.expr.const(0.0, dtype="float32")
        if score_threshold is None:
            score_threshold = relax.expr.const(0.0, dtype="float32")

        def conditionally_squeeze_scalar(x):
            rank = len(x.struct_info.shape)
            assert rank <= 1, "nms thresholds must be scalars"
            if rank == 1:
                return relax.op.squeeze(x, [0])
            return x

        max_output_boxes_per_class = conditionally_squeeze_scalar(max_output_boxes_per_class)
        iou_threshold = conditionally_squeeze_scalar(iou_threshold)
        score_threshold = conditionally_squeeze_scalar(score_threshold)

        # fix: ICHECK(ptr) << "The struct_info is not populated, check if you have normalized the expr";
        max_output_boxes_per_class = bb.normalize(max_output_boxes_per_class)
        iou_threshold = bb.normalize(iou_threshold)
        score_threshold = bb.normalize(score_threshold)

        # nms_out is a tuple: (selected_indices, num_total_detections).
        nms_out = bb.emit_te(
            topi.vision.all_class_non_max_suppression,
            boxes,
            scores,
            max_output_boxes_per_class,
            iou_threshold,
            score_threshold,
        )

        # Slice selected_indices down to the valid rows:
        # [0:num_total_detections, 0:3].
        begin = relax.const([0, 0], dtype="int64")
        end = bb.normalize(relax.op.concat([nms_out[1], relax.const([3], dtype="int64")]))
        strides = relax.const([1, 1], dtype="int64")
        return bb.normalize(
            relax.op.dynamic_strided_slice(nms_out[0], begin=begin, end=end, strides=strides)
        )
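To exercise the converter in isolation, here is a rough test sketch; the graph and tensor names are made up, and it assumes the converter above is registered in the frontend's convert map:

import onnx
from onnx import helper, TensorProto
from tvm.relax.frontend.onnx import from_onnx

# Build a one-node ONNX graph that only contains NonMaxSuppression.
node = helper.make_node(
    "NonMaxSuppression",
    inputs=["boxes", "scores", "max_boxes", "iou_thresh", "score_thresh"],
    outputs=["selected"],
    center_point_box=0,
)
graph = helper.make_graph(
    [node],
    "nms_test",
    inputs=[
        helper.make_tensor_value_info("boxes", TensorProto.FLOAT, [1, 6, 4]),
        helper.make_tensor_value_info("scores", TensorProto.FLOAT, [1, 1, 6]),
        helper.make_tensor_value_info("max_boxes", TensorProto.INT64, [1]),
        helper.make_tensor_value_info("iou_thresh", TensorProto.FLOAT, [1]),
        helper.make_tensor_value_info("score_thresh", TensorProto.FLOAT, [1]),
    ],
    outputs=[helper.make_tensor_value_info("selected", TensorProto.INT64, [None, 3])],
)
model = helper.make_model(graph)
mod = from_onnx(model)  # should dispatch to the converter above
mod.show()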