apache:unity
← slyubomirsky:purity-tracking
opened 03:38AM - 24 Mar 23 UTC
In this PR, I am beginning to implement the tracking of function purity as part of the `StructInfo` system. This will allow the compiler to enforce that no impure function (one that can possibly have visible side effects) can be called in a `DataflowBlock`. Tracking this requires noting which operators are pure or impure (I am presently doing this with an operator attribute called `FPurity`, a simple boolean), though dealing with calls to other Relax functions means that this information must also be tracked via the `StructInfo` system.
Additionally, it is difficult to infer the purity of a function in the general case (when there are calls to other Relax functions), so this change does require users to annotate impure functions using a function attribute (`IsPure`). Since most Relax functions are likely to be pure, this will hopefully not be a large imposition on users. We can consider inferring purity in the easier cases, since those are likely to be common.
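To make the two sources of purity information concrete, here is a minimal sketch in plain Python, not the code in this PR. The `OP_PURITY` table, the `FuncInfo` class, and the `toy.*` operator names are hypothetical stand-ins for the `FPurity` operator attribute and the purity bit carried in a function's `StructInfo`.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the operator registry: op name -> FPurity flag.
# The operator names and their purity values are illustrative assumptions.
OP_PURITY = {
    "toy.add": True,
    "toy.print": False,
}


@dataclass
class FuncInfo:
    """Toy stand-in for the purity bit carried in a function's StructInfo."""
    name: str
    is_pure: bool = True  # functions are assumed pure unless annotated otherwise


def callee_is_pure(callee):
    """Return the purity of a callee: either an op name (str) or a FuncInfo."""
    if isinstance(callee, FuncInfo):
        # Relax functions: purity comes from the user's annotation.
        return callee.is_pure
    # Operators: purity comes from the per-op flag. Treating unregistered
    # operators as impure is a conservative assumption of this sketch.
    return OP_PURITY.get(callee, False)
```

The default of `is_pure=True` mirrors the choice described above: only impure functions need an annotation.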
As an "escape hatch" to the purity system, this change also adds an operator called `call_pure`, which wraps a call to an impure function and tells the compiler to treat the call as pure. Additionally, a function can be labeled with the `ForcePure` attribute, which tells the compiler to treat the entire function as pure even if it contains an impure call. These can be used to deal with the following situations:
1. Some function or operator can have side effects in _some_ situations, but the user knows for a fact it won't in _this_ one. `call_pure` would be useful here.
2. A function performs side effects, but only on a value that will not be exposed anywhere else or on a new value that will be returned. Even though the individual actions are "impure," the overall function fulfills the definition of being pure. `ForcePure` would be useful here.
Changes include:
* Enforcing that impure functions are not used in `DataflowBlock`s in the well-formed check.
* Enforcing that functions that are not labeled impure do not contain impure calls (unless labeled with `ForcePure`).
* Implementing the `call_pure` operator
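
The two checks and both escape hatches can be sketched over a toy AST as follows. This is an illustration of the rules as described here, not the well-formed checker itself; the classes `Op`, `Call`, `Block`, and `Function` and the helper `purity_errors` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Op:
    name: str
    pure: bool  # plays the role of the FPurity attribute


@dataclass
class Call:
    callee: Union["Op", "Function"]
    wrapped_in_call_pure: bool = False  # models call_pure(callee, ...)


@dataclass
class Block:
    calls: List[Call]
    is_dataflow: bool = False


@dataclass
class Function:
    name: str
    blocks: List[Block] = field(default_factory=list)
    is_pure: bool = True      # models the IsPure annotation
    force_pure: bool = False  # models the ForcePure attribute


def call_is_pure(call: Call) -> bool:
    """A call is pure if wrapped in call_pure or if its callee is pure."""
    if call.wrapped_in_call_pure:
        return True
    callee = call.callee
    if isinstance(callee, Op):
        return callee.pure
    # A ForcePure function is treated as pure by its callers.
    return callee.is_pure or callee.force_pure


def purity_errors(func: Function) -> List[str]:
    """Collect violations of the two purity rules for one function."""
    errors = []
    for block in func.blocks:
        for call in block.calls:
            if call_is_pure(call):
                continue
            if block.is_dataflow:
                # Rule 1: no impure calls inside a DataflowBlock.
                errors.append(f"{func.name}: impure call to {call.callee.name} in DataflowBlock")
            elif func.is_pure and not func.force_pure:
                # Rule 2: a function not labeled impure may not contain
                # impure calls, unless it is labeled ForcePure.
                errors.append(f"{func.name}: impure call to {call.callee.name} in a function not labeled impure")
    return errors
```

In this sketch `ForcePure` only relaxes the second rule; an unwrapped impure call inside a `DataflowBlock` is still reported, and `call_pure` is the only way to permit it there.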
Still to be done:
- [x] Label purity for all operators
- [x] Address certain tricky bugs
(Also address the design concerns below)
----
The process of implementing purity tracking has revealed a few issues that may require some further design discussion.
## Labeling purity or impurity
Using function attributes to label purity/impurity seems very messy. It might be worth making this a field of the `Function` AST node to avoid having to wrangle attributes in many places in the codebase.
## The treatment of `call_pure`
`call_pure` presents a dilemma because many passes look for calls to certain operators, but with `call_pure`, these would be "wrapped" like so: `Call(Op("relax.call_pure"), [inner_op, arg1, ..., argn], attrs, sinfo_args)`. In the posted PR, many passes needed to be revised to look for instances of `call_pure` so that they could be treated the same as calls to ordinary operators. It might cause less disruption to passes to turn `call_pure` into a specialized AST node that literally wraps a call node. This way visitors or mutators could have an easy default case and the lower-level code generation passes could simply ignore the `call_pure` node and deal with the wrapped call. As painful as introducing a new AST node would be, it would likely be more maintainable than requiring every pass to special-case the `call_pure` operator.
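
For illustration, the special case each operator-matching pass currently needs looks roughly like the sketch below. It uses a toy `Call`/`Op` pair rather than the real AST, and the names `CALL_PURE`, `unwrap_call_pure`, and `matches_op` are hypothetical helpers, not functions in the codebase.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Op:
    name: str


@dataclass(frozen=True)
class Call:
    op: Op
    args: Tuple[object, ...] = ()


CALL_PURE = Op("call_pure")  # toy stand-in for the call_pure operator


def unwrap_call_pure(call: Call) -> Call:
    """Turn call_pure(inner_op, arg1, ..., argn) into inner_op(arg1, ..., argn).

    Calls that are not wrapped are returned unchanged.
    """
    if call.op == CALL_PURE and call.args and isinstance(call.args[0], Op):
        return Call(call.args[0], call.args[1:])
    return call


def matches_op(call: Call, op_name: str) -> bool:
    """Operator check that sees through a call_pure wrapper."""
    return unwrap_call_pure(call).op.name == op_name
```

A pass that forgets to go through something like `unwrap_call_pure` silently misses wrapped calls, which is the maintainability concern. With a dedicated wrapper node, the default visitor could perform this unwrapping once for everyone.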
## Staging
Related to the above issue, having purity tracking in the well-formedness analysis means that even low-level passes like `VMLowerBuiltin`, which replaces some Relax operators with `PackedFunc`s (treated as impure by default, as they are in principle dangerous black boxes), have to insert calls to `call_pure` to avoid triggering purity errors. Perhaps it might be useful to ignore purity by the time we reach lower stages of compilation, much as the `ToNonDataflow` pass is used in the VM build process to eliminate `DataflowBlock`s.