Hi, I have some confusion about the quantization scheme.
```python
def quantize(graph, params=None, dataset=None):
    """
    dataset: list of dict of Var -> NDArray
        The calibration dataset.
    """
```
As shown in the docstring above, where is `dataset` actually used?
Besides, the `calibrate` function contains:
```python
if kind == QAnnotateKind.WEIGHT:
    var = expr.args[0]
    assert isinstance(var, _expr.Constant)
    scale = power2_scale(var.data)
else:
    scale = cfg.global_scale

valid_range = 2**valid_bit
const_params[ndom_scale] = _make_const(scale / valid_range)
...
```
and `cfg.global_scale` is 8.0 by default, which means the `ndom_scale` of every INPUT/ACTIVATION ends up with the same value (0.0625). How does that work? In other frameworks such as TF, calibration datasets plus an EMA algorithm are used to estimate an appropriate scale per tensor, but TVM doesn't seem to use that approach.
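To make the question concrete, here is a minimal sketch of the two code paths as I understand them. The `power2_scale` body is my assumption (round the max absolute weight up to the nearest power of two), and `valid_bit = 7` assumes signed 8-bit quantization; neither is confirmed from the TVM source.

```python
import math
import numpy as np

def power2_scale(weight):
    # Assumed behavior (mirroring what TVM's power2_scale seems to do):
    # round the max |weight| up to the nearest power of two.
    amax = float(np.abs(weight).max())
    return 2.0 ** math.ceil(math.log2(amax)) if amax > 0 else 1.0

# WEIGHT path: scale is data-dependent, derived from the constant's values.
w = np.array([0.3, -0.7, 0.05], dtype="float32")
w_scale = power2_scale(w)              # max |w| = 0.7 -> rounds up to 1.0

# INPUT/ACTIVATION path: one fixed global value, independent of any data.
global_scale = 8.0                     # cfg.global_scale default
valid_bit = 7                          # assumed: 8-bit signed -> 7 magnitude bits
ndom_scale = global_scale / 2**valid_bit
print(w_scale, ndom_scale)             # -> 1.0 0.0625
```

This reproduces the 0.0625 I see, and it is what prompts the question: every activation gets the same `ndom_scale`, regardless of the calibration dataset.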