[RFC] Verification and (possibly) program synthesis of expression rewriting rules

yzh119 · March 16, 2022, 12:40am

Currently, TVM uses the following way to define expression simplifying rules:

apache/tvm/blob/f5e0c102057641d88f06ad865d5a1d4e99bd70d7/src/arith/rewrite_simplify.cc

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

/*!

This file has been truncated. show original

This approach is error-prone and not scalable:

The rewrite rules were added manually, the number of possible rewriting rules might be too large and not maintainable.
Lack verification of rewriting rules, we only have unittests: which can not cover all possible cases.
The order of rewrite matters, which is not considered here.

For 2, we can use existing SMT solvers such as Z3, for the issue 10610 (introduced in pr 10591), we can try finding a counter-example with Z3:

>>> from z3 import *
>>> x = BitVec('x', 32)
>>> y = BitVec('y', 32)
>>> c1 = BitVec('c1', 32)
>>> c2 = BitVec('c2', 32)
>>> solve(x > 0, y > 0, c1 > 0, c2 > 0, c2 % c1 == 0, (x * c1 + y) % c2 != (x % (c2 / c1)) * c1 + y, y < c1)
[y = 79521, x = 201523190, c1 = 1909062, c2 = 1956788550]

where BitVec(..., 32) means int32.

We can write tests for all these rewriting rules either on C++ side or Python side.

How to automatically generate rewrite rules remains a research problem, program synthesis (rosette) and equality saturation (egg, ruler) might help here. However, we need to define the optimization target of rewriting (make it simpler? can we formulate the level of “simple”? etc).

masahi · March 16, 2022, 12:56am

Halide people did something like that https://dl.acm.org/doi/10.1145/3428234