tico
August 28, 2019, 11:07am
1
Hi,
I am trying to quantize a model that is originally in NHWC, so in order to quantize it I set the target data layout to NCHW. However, as discussed in other threads, the change in data layout means that transpose operators get inserted. The problem is that these transpose operators (and also nn.pad operators) land in between the chain of convolutions, and since transpose has no quantization rule in TVM, many casting operators from float to int appear along the chain of convolutions.
What can be done to fix this behavior? How difficult would it be to quantize the transpose operator?
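For context, here is roughly the kind of flow that triggers this (a minimal sketch; the TensorFlow frontend and `graph_def` are stand-ins for my actual setup):

```python
import tvm
from tvm import relay

# graph_def: a TensorFlow GraphDef for the NHWC model, loaded elsewhere.
# Asking the frontend for NCHW is what inserts the transpose operators.
mod, params = relay.frontend.from_tensorflow(graph_def, layout="NCHW")

# Ops without a quantization rule (transpose, nn.pad) stay in float32,
# so casts are inserted around them between the quantized convolutions.
with relay.quantize.qconfig():
    qmod = relay.quantize.quantize(mod, params)
```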
@vinx13 @ziheng Could you please give me a hint here?
Thanks
vinx13
August 29, 2019, 5:44pm
2
transpose is easy, as it only needs an identity rewrite. See the relevant pieces of the quantization pass:
In `_partition.py`, identity ops simply forward the call and stay inside the quantized partition:

```python
def identity_partition_function(ref_call, new_args, ctx):
    cond, expr = partition_expr_check(new_args[0])
    if cond:
        return QPartitionExpr(_forward_op(ref_call, [expr]))
    return None

register_partition_function("clip", identity_partition_function)
register_partition_function("nn.relu", identity_partition_function)
register_partition_function("nn.max_pool2d", identity_partition_function)
```
In `_annotate.py`, the shared identity annotate rule and its registrations:

```python
def identity_rewrite(ref_call, new_args, ctx):
    """Simply forward the original operation"""
    if quantize_context().check_to_skip(ref_call):
        return None

    x_expr, x_kind = _get_expr_kind(new_args[0])
    if x_kind is None:
        return None

    ret_expr = _forward_op(ref_call, [x_expr])
    return QAnnotateExpr(ret_expr, x_kind)

register_annotate_function("clip", identity_rewrite)
register_annotate_function("nn.relu", identity_rewrite)
register_annotate_function("strided_slice", identity_rewrite)
register_annotate_function("nn.avg_pool2d", identity_rewrite)
register_annotate_function("annotation.stop_fusion", identity_rewrite)
```
And in realize.cc, the identity realize rule forwards the op on the already-quantized data, keeping the domain scale and dtype:

```cpp
Expr IdentityRealize(const Call& ref_call,
                     const Array<Expr>& new_args,
                     const NodeRef& ctx) {
  CHECK_EQ(new_args.size(), 1);
  if (const auto* n = new_args[0].as<QRealizeIntExprNode>()) {
    Expr ret = ForwardOp(ref_call, {n->data});
    return QRealizeIntExprNode::make(ret, n->dom_scale, n->dtype);
  }
  CHECK(!new_args[0]->derived_from<TempExprNode>());
  return Expr(nullptr);
}

RELAY_REGISTER_OP("nn.relu")
.set_attr<FForwardRewrite>("FQRealizeRewrite", IdentityRealize);

RELAY_REGISTER_OP("strided_slice")
.set_attr<FForwardRewrite>("FQRealizeRewrite", IdentityRealize);

RELAY_REGISTER_OP("annotation.stop_fusion")
.set_attr<FForwardRewrite>("FQRealizeRewrite", IdentityRealize);

/* \brief for unary operators which requantize its input to dtype_nbit */
Expr CastDtypeInputRealize(const Call& ref_call,
                           const Array<Expr>& new_args,
                           const NodeRef& ctx) {
  const QConfig& cfg = QConfig::Current();
  // ... casts the input to cfg->dtype_input, keeping dom_scale
}
```
pad needs a custom rule in realize so that it pads with a quantized-type (i.e. int) value instead of the original float.
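Putting this together for transpose, a hypothetical sketch of the Python side (the import paths and the reuse of the identity handlers are my assumptions; the realize step would additionally need a `RELAY_REGISTER_OP("transpose")` entry bound to `IdentityRealize` in realize.cc):

```python
# Hypothetical: reuse the existing identity handlers for transpose.
from tvm.relay.quantize._partition import (
    register_partition_function, identity_partition_function)
from tvm.relay.quantize._annotate import (
    register_annotate_function, identity_rewrite)

register_partition_function("transpose", identity_partition_function)
register_annotate_function("transpose", identity_rewrite)
```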
tico
September 2, 2019, 7:07am
3
Hi @vinx13 , thanks for the pointers!
I have a couple of further questions:
Could you please give further hints on the custom realize rule for the pad operator?
What about quantizing dense layers? I think I saw some code related to this. Is it already supported?
Thanks
tico
September 2, 2019, 11:42am
4
BTW, can reshape also be implemented with the identity realize?
vinx13
September 2, 2019, 8:21pm
5
For pad, you need to implement a PadRealize rule (see the sketch below).
Quantizing dense is already supported.
Reshape can be implemented with the identity realize.
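The essence of such a pad rule is requantizing nn.pad's float pad_value into the input's domain scale before padding. A minimal sketch of that arithmetic (a hypothetical helper, not TVM's actual implementation):

```python
import numpy as np

def quantize_pad_value(pad_value, dom_scale, dtype="int8"):
    """Express a float pad value in quantized units (hypothetical helper).

    A PadRealize rule must pad with round(pad_value / dom_scale),
    clipped to the integer dtype's range, instead of the float value.
    """
    info = np.iinfo(dtype)
    q = int(round(pad_value / dom_scale))
    return max(info.min, min(info.max, q))

# Zero padding stays zero under any scale; nonzero values requantize:
assert quantize_pad_value(0.0, dom_scale=0.05) == 0
assert quantize_pad_value(1.0, dom_scale=0.05) == 20
```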