Hi, I want to mimic the behavior of quantization. My application is quite simple, only 2 functions, so I want to test on add first: generate an int8 .so, and then try inference on float32 data.
Here is how I generate it:
import tvm
from tvm import te

tgt = tvm.target.Target(target="llvm", host="llvm")
# dtype = 'float16'
dtype = 'int8'
n = te.var("n")
ph_a = te.placeholder((n,), dtype=dtype, name="ph_a")
ph_b = te.placeholder((n,), dtype='float32', name="ph_b")
ph_c = te.compute(ph_a.shape, lambda i: ph_a[i] + ph_b[i], name="ph_c")
sched = te.create_schedule(ph_c.op)
fadd_dylib = tvm.build(sched, [ph_a, ph_b, ph_c], tgt, name="vector_add")
So now I have an add in int8.
What if I want to run inference on float32 data? How should I do that?
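To make clear what I mean by mimicking quantization, here is a rough plain-Python sketch of the arithmetic I have in mind: float32 inputs are scaled to int8, added as integers, and scaled back. This is only an illustration under my own assumptions, not TVM code; the names `quantize`, `dequantize`, `quantized_add` and the `scale` value are made up for the example, and a single shared symmetric scale is assumed.

```python
# Sketch of simulated int8 quantization for an element-wise add.
# All names and the scale value are hypothetical, for illustration only.

INT8_MIN, INT8_MAX = -128, 127


def quantize(values, scale):
    """Map floats to the int8 range: divide by scale, round, then clamp."""
    return [max(INT8_MIN, min(INT8_MAX, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map int8 values back to floats by multiplying with the scale."""
    return [q * scale for q in q_values]


def quantized_add(a, b, scale):
    """Quantize both inputs, add as integers with saturation, dequantize."""
    qa = quantize(a, scale)
    qb = quantize(b, scale)
    # Saturate the sum so it stays representable as int8.
    qc = [max(INT8_MIN, min(INT8_MAX, x + y)) for x, y in zip(qa, qb)]
    return dequantize(qc, scale)


a = [0.5, -1.0, 2.0]
b = [0.25, 0.5, -0.5]
result = quantized_add(a, b, scale=0.05)
print(result)
```

In this picture, the int8 kernel would only ever see `qa` and `qb`; the float32 data is converted before the call and the result is converted back afterwards.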