[meta-schedule][tensorization] BERT tuning runtime error

Hello, guys. I tried to tensorize and tune the BERT model using meta-schedule, but I ran into the following runtime error:

E         7: tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::VisitStmt(tvm::tir::Stmt const&)
E         6: _ZZN3tvm3tir11StmtFunctorIFNS0_4StmtERKS2_EE10InitVTableEvENUlRKNS_7runti
E         5: tvm::tir::ThreadBindingUnifier::VisitStmt_(tvm::tir::ForNode const*)
E         4: tvm::tir::StmtMutator::VisitStmt(tvm::tir::Stmt const&)
E         3: tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::VisitStmt(tvm::tir::Stmt const&)
E         2: _ZZN3tvm3tir11StmtFunctorIFNS0_4StmtERKS2_EE10InitVTableEvENUlRKNS_7runti
E         1: tvm::tir::ThreadBindingUnifier::VisitStmt_(tvm::tir::ForNode const*)
E         0: tvm::tir::Stmt tvm::tir::ThreadBindingUnifier::UnifyThreadBindingImpl<tvm::tir::ForNode>(tvm::tir::ForNode const*, tvm::tir::Var const&, tvm::tir::IterVar const&, tvm::Range const&)
E         File "~/code/tvm-main/src/support/parallel_for.cc", line 128
E       RuntimeError: parallel_for_dynamic error with [17:22:12] ~/code/tvm-main/src/tir/transforms/unify_thread_binding.cc:112: Check failed: (ana.CanProveEqual(dom->extent, new_iter_var->dom->extent)) is false: ValueError: All loops that are bound to threadIdx.y should have the same extent. However, there are two loops with extent T.int64(2) and T.int64(1), which are not equal

~/tvm/python/tvm/_ffi/base.py:481: TVMError

The script is tvm/tests/python/integration/test_auto_tensorize.py. Can anybody give me some advice?

Hi. The error is not really a runtime issue: the trace goes through parallel_for only because TVM uses the same parallel-execution code at both compile time and runtime. I haven’t tried the BERT test case, but I found a bug in the meta-schedule Tensor Core policy's support for batched matmul. Here is the fix; you may want to give it a try: [BugFix][MetaSchedule] MultiLevelTilingTensorCore generates inconsistent thread-binding sketch for batched matmul by tsu-bin · Pull Request #17012 · apache/tvm · GitHub
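
For anyone hitting the same message, the condition that fails in unify_thread_binding.cc can be illustrated with a small pure-Python sketch. It does not call TVM. The function name check_thread_extents and the (thread_tag, extent) tuple representation are hypothetical, and plain integer equality stands in for the analyzer's CanProveEqual check seen in the trace.

```python
def check_thread_extents(loops):
    """Mimic the consistency rule from the error message.

    `loops` is a list of (thread_tag, extent) pairs, a hypothetical
    stand-in for the thread-bound For loops of one kernel. Every loop
    bound to the same thread tag must have the same extent.
    """
    seen = {}
    for tag, extent in loops:
        if tag in seen and seen[tag] != extent:
            raise ValueError(
                f"All loops that are bound to {tag} should have the same "
                f"extent. However, there are two loops with extent "
                f"{seen[tag]} and {extent}, which are not equal"
            )
        # Remember the first extent seen for this thread tag.
        seen.setdefault(tag, extent)
    return seen


# A consistent sketch: both threadIdx.y loops have extent 2.
ok = check_thread_extents(
    [("threadIdx.x", 32), ("threadIdx.y", 2), ("threadIdx.y", 2)]
)

# An inconsistent sketch like the one in the trace: extents 2 and 1.
try:
    check_thread_extents([("threadIdx.y", 2), ("threadIdx.y", 1)])
except ValueError as err:
    print(err)
```

The second call reproduces the shape of the failure above: two loops bound to threadIdx.y with extents 2 and 1. A sketch that binds inconsistent extents is rejected during compilation, before any kernel runs.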

Have a nice day

Hi, thank you for your reply. When tuning BERT, the error is raised exactly when it tries to tune Batch_MatMul. I just tried your fix and it worked! I really appreciate your help!