A demo: E = A + B, F = E + B. There are two outputs, E and F, and I want to return both of them, so I wrote a demo like this:
```python
import tvm

shape = (10, 1024)
A = tvm.placeholder(shape, name="A", dtype="float16")
B = tvm.placeholder(shape, name="B", dtype="float16")
E = tvm.compute(A.shape, lambda *index: A(*index) + B(*index), name="E")
F = tvm.compute(E.shape, lambda *index: E(*index) + B(*index), name="F")
sch = tvm.create_schedule([F.op])
# Stage the inputs into local scope for their consumers.
AL = sch.cache_read(A, "local", [E])
BL = sch.cache_read(B, "local", [E, F])
# Compute E and F in local scope, then write them back to global.
EL = sch.cache_write(E, "local")
FL = sch.cache_write(F, "local")
# build_config is assumed to be defined elsewhere, e.g. build_config = tvm.build_config()
with build_config:
    print(tvm.lower(sch, [A, B, E, F], simple_mode=True))
```
In the IR this produces, both outputs E and F are written to global memory. However, the `produce F.local` stage reads E from global memory, even though the computation itself runs in local scope. If we build this, the result will be wrong.
If we add `sch[E].compute_inline()`, the computation in `produce F.local` becomes correct: every variable it uses is in local scope. But E is now inlined, so F is the only output we can get. This is not what we want.
So how can we get the correct IR and still return multiple outputs?
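
To make the goal concrete, here is a minimal plain-Python sketch of the dataflow being asked for. It is not TVM code and not a schedule that solves the problem; `fused_add` and the buffer names are hypothetical, chosen only to mirror the stages in the demo. The point it illustrates is that F should be computed from the local copy of E, while both E and F are still written back to global.

```python
# Plain-Python model of the desired dataflow (illustration only, not TVM).
def fused_add(a_global, b_global):
    # Like cache_read: stage both inputs into "local" buffers.
    a_local = list(a_global)
    b_local = list(b_global)
    # E.local = A.local + B.local
    e_local = [a + b for a, b in zip(a_local, b_local)]
    # F.local reads E.local, never the global copy of E.
    f_local = [e + b for e, b in zip(e_local, b_local)]
    # Like cache_write: copy both results back to "global".
    e_global = list(e_local)
    f_global = list(f_local)
    return e_global, f_global


e, f = fused_add([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(e)  # E = A + B
print(f)  # F = E + B
```

In this model E is never inlined away, and F never touches global memory during its computation, which are the two properties the lowered IR should have.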