The first issue is already supported in TVM: just use `Var` as you mentioned (btw, v0.8.0 is a bit outdated; the version has since been bumped to v0.14).
The second issue is about data-dependent operators such as `nonzero` and `spgemm`. My opinion is that these should be handled in a higher-level IR such as Relax.
The generic workflow of a data-dependent operator could be:
- Call an operator (`f_estimate`) to estimate the memory requirement of the output buffer.
- Allocate enough memory for the output buffer.
- Call the compute operator (`f_compute`) to perform actual computation.
After this 3-step decomposition, each operator’s output buffer size is determined before it executes, and both `f_estimate` and `f_compute` can be described in TensorIR.
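To make the three steps concrete, here is a minimal sketch for `nonzero` in plain Python. It is only an illustration of the control flow: the function bodies are ordinary Python stand-ins, not TensorIR, and the wrapper name `nonzero` and the list-based buffer are my own assumptions.

```python
# Plain-Python stand-ins for the three steps. In TVM, f_estimate and
# f_compute would be TensorIR functions; these bodies are illustrative only.

def f_estimate(data):
    """Step 1: return the exact number of output elements (count of nonzeros)."""
    return sum(1 for x in data if x != 0)


def f_compute(data, out):
    """Step 3: write the indices of nonzero elements into the preallocated buffer."""
    pos = 0
    for i, x in enumerate(data):
        if x != 0:
            out[pos] = i
            pos += 1


def nonzero(data):
    size = f_estimate(data)   # step 1: estimate the output size
    out = [0] * size          # step 2: allocate the output buffer
    f_compute(data, out)      # step 3: fill the buffer
    return out


result = nonzero([0, 3, 0, 0, 7, 1])
print(result)
```

Note that `f_compute` never has to grow its output: by the time it runs, the buffer already has a fixed size, which is the property that makes it expressible as a static-shape kernel.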
One example is cuSPARSE’s `cusparseSpGEMM` operator, which exposes three APIs:
- `cusparseSpGEMM_workEstimation`
- `cusparseSpGEMM_estimateMemory`
- `cusparseSpGEMM_compute`

The first two estimate the required buffer sizes, and the last one performs the actual computation.
We can design a construct such as `call_tir_data_dependent` in Relax for such workflow:
output = R.call_tir_data_dependent((f_estimate, f_compute), args, ...)
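
As a rough idea of the semantics such a construct could have, here is a pure-Python sketch. Everything in it is hypothetical: `call_data_dependent` is not a Relax API, the CSR inputs are plain tuples of lists, the output is a single buffer of `(row, col, value)` triples, and I assume `f_estimate` may return an upper bound, so the driver trims the unused capacity afterwards.

```python
# Hypothetical driver mirroring call_tir_data_dependent; NOT an existing Relax API.

def call_data_dependent(fns, args):
    """Estimate capacity, allocate, compute, then trim the unused tail."""
    f_estimate, f_compute = fns
    capacity = f_estimate(*args)   # may be an upper bound (assumption)
    out = [None] * capacity        # allocation happens before compute
    used = f_compute(*args, out)   # compute reports how many slots it filled
    return out[:used]


def spgemm_estimate(a, b):
    """Upper bound on nnz(A @ B): each stored A[i, k] can add at most len(B[k]) entries."""
    _, a_indices, _ = a
    b_indptr, _, _ = b
    return sum(b_indptr[k + 1] - b_indptr[k] for k in a_indices)


def spgemm_compute(a, b, out):
    """Write (row, col, value) triples of A @ B into `out`; return the count written."""
    a_indptr, a_indices, a_values = a
    b_indptr, b_indices, b_values = b
    used = 0
    for i in range(len(a_indptr) - 1):
        row = {}  # column -> accumulated value for row i of the result
        for p in range(a_indptr[i], a_indptr[i + 1]):
            k, a_val = a_indices[p], a_values[p]
            for q in range(b_indptr[k], b_indptr[k + 1]):
                j = b_indices[q]
                row[j] = row.get(j, 0) + a_val * b_values[q]
        for j in sorted(row):
            out[used] = (i, j, row[j])
            used += 1
    return used


# CSR as (indptr, indices, values): A = [[1, 0], [2, 3]], B = [[4, 0], [5, 0]]
A = ([0, 1, 3], [0, 0, 1], [1, 2, 3])
B = ([0, 1, 2], [0, 0], [4, 5])
C = call_data_dependent((spgemm_estimate, spgemm_compute), (A, B))
print(C)
```

In this example the estimate is 3 slots while only 2 are used, which is why the driver needs the final trimming step. A real design would also have to decide how that final size is communicated back to the rest of the Relax program.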