In our optimizations we are heavily dependent on the bound information of the expressions. In particular, the sign information is very useful in deciding simplifications and final lowering.
Currently, we usually have to get the bound information of variables from its context, or simply lost the the bound information during lowering
Example Problem
Consider the following code, which flattens an array.
n = tvm.var("n")
A = tvm.placeholder((n, n), name="A")
B = compute((n * n), lambda i: A[floormod(i, n)][floordiv(i, n)])
for (i, 0, n * n) {
B[i] = A[floormod(i, n)][floordiv(i, n)]
}
The key question here is how can be lower floormod(i, n)
and floordiv(i, n)
operations. Due to different division semantics. The most effective way is to directly lower floormod(i, n) => truncmod(i, n)
which directly corresponds to i % n
operation in C. However, in order to do this, we will need to know that i >= 0
and n >= 0
.
The current analyzer can try to get the context information of i
from its loop body. However, it also depends on how a developer sets it up. If we forget to populate the context information of i
and directly run the logic below, then the analyzer do not know that i
is actually non-negative.
FreshAnalzyer().RunLowerFloor(floordiv(i, n))
If we instead carefully populates the context information via recursive visiting, then we can take benefit of i
's bound information.
However, in the above example, the even worse case is that the lowered code no longer hints that n
is non-negative, which it should be, because it represent shape of a tensor. Without this information, we can no longer efficient generate lowered code for floordiv(i, n)
.
Note that this is only one example that illustrates the importance of the additional bound information. This example uses the floordiv/mod, so the challenge arises in the final lowering step. If we use other division mode, such as truncdiv which directly corresponds to the low level instruction, the problem will occur during simplication, where the simplifier do not know about the corresponding bounds.
Summary of Issues
To summarize, there are two issues we want to address:
- Generate useful bound information for symbolic variables, so we can do effective lowering
- Sometimes a developer forget to populate context information, making the analysis less effective(due to lack of these information)
Possible Solutions
The good news is we do have ways to solve the above problems, this RFC is mainly to discuss which way(or ways) we would like to adopt to make our life easier.
Proposal 1: Generate AssertStmt When Binding Shapes
First of all, because we know n
is a shape, we already know that it is not negative. We can take benefit of this when generate binding information for shapes in ArgBinder
However, this information was not propagated to the codegen phase. We can generate assertion, via AssertStmt, when binding those variables. So the eventual low level code becomes the following code.
assert n >= 0
for (i, 0, n * n) {
B[i] = A[floormod(i, n)][floordiv(i, n)]
}
Now the context-dependent analysis can catch the bound information of n and do successful lowering.
Pros of this approach
- This is perhaps the most light-weight approach.
Cons of this approach
- If we do analysis before the arg binding phase, the non-negative information of
n
is not available. - If we want to do context-independent analysis, this information is not available.
Proposal 2: Embed Bound Information into Expression
Both this and the next proposal tries to embed bound information directly into the expression, besides the context. For example, we can introduce an NonNegExpr expression, which returns the value and asserts that the value is non-negative.
class NonNegExpr : public Expr {
public:
Expr value;
};
When we create tvm.placeholder(shape=(n, n))
, we actually store them as
(nonneg(n), nonneg(n))
. That is, we actively add non-negative hint wrapper to the stored expression. Because the expression have the non-negative hint, we can do the analysis easily.
Of course, depends on what we want, we could introduce more complicated expressions, such as AssertExpr(cond, value)
, which returns value while asserting condition is true
. Or BoundConstraintExpr(value, const_int_lower_bound, const_int_upper_bound)
.
Main advantage of embedding information into expression.
- Because the bound information is embedded in the expression, we can do analysis without even populate the context information. This is a huge advantage over the previous case
- The bound assertions are attached when we pass things into shape constructors, user do not have to think too much about it.
Main drawback of this approach
- Because the bound assertion is an nested expression, it is hard for certain simplification cases.
For example, the simplification engine cannot deal with the following case.
nonneg(x * y) / nonneg(x)
Proposal 3: Embed Bound Information into Variable
A more drastic change to the IR structure, is to embed the constraint information into the expression. This can be done by either introduce bound or non-negative flag to the Variable, or make a subclass of Variable
class ShapeVar : public Var {
public:
int64_t const_lower_bound = -kPosInf;
int64_t const_upper_bound = +kPosInf;
};
For loop variables, we can unify IterVar as a sub-class of Var, and the range field will be able to provide these information (this seems to might be a good change regardless, except that it will result in code refactor)
Pros of this approach:
- Same as proposal 2, we can do context-independent analysis
- We won’t have the problem of simplifying nested annotation cases as in proposal 2.
Cons of this approach:
- A developer need to be mindful about declaring the bound of the variable. For example, we now might need to write:
n = tvm.var("n", nonneg=True)
- What if a developer forget to do so, and still uses the normal variable for shape, in theory we allow this case as well, as the additional constraints just helps to generate better, code, should we do a rewriting phase to embed information back, or should we shoot an warning?
- We do need to create a new Variable(with new bound information), as we rewrite to a smaller region of the loop.
Proposal 4: Same as Proposal 2, But only Wraps Variables
This avoids cases like nonneg(x * y)
, because we won’t do auto wrapping for these cases.
Things to Discussions
As we can see, embedding information into the Vars or Exprs bring the benefit of additional analysis without dependent on the context. That does mean, however, we need to choose to recreate the Var with every time we might want to update the hint.
These information can complement the context-dependent analysis. For example, we could declare a variable with bound [0, 100]
, but then get additional constraints based on the loop or assertions.
Things to discuss:
- Which proposal do you like
- Shall we make IterVar as subclass of Var
- For the additional constraint information, regardless of the nested expression, or embed into Variable, what should that constraint field be like.
- (a) It can be a simple
bool nonneg
flag, as this is the most important usecase. - (b) It can be a
int64 const_lower_bound
and upper bound pair. The upper bound might be useful for memory analysis in dynamic shapes. - © It should be
AssertExpr(any_condition, value)
, because this is the most general version. - Of course this is a trade-off. We could always introduce combinations, like (b) primarily but also support ©
- (a) It can be a simple
- Should we build nested expression or embed it into the Variable(perhaps as sub-class)
- In the case of nested expression, should we allow nesting on non-variables. which is less useful and results in problems like ```nonneg(x*y)/nonneg(x)````
During discussion, please keep in mind that we are making tradeoffs here, regardless of which choice, I feel pushing it would make the analysis and codegen towards a better direction.
Please share your thoughts