Apache TVM v0.9.0 Release

The Apache TVM community is happy to announce the release of TVM v0.9.0.

The TVM community has worked since the v0.8 release to deliver many exciting features and improvements. v0.9.0 is the first release on the new quarterly release schedule and includes many highlights, such as:

  • MetaSchedule: all modules fully upstreamed; complete support for TIR, TE, Relay, and ONNX tuning; schedule rule and postproc support for AutoTensorization; customizable profiling and logging; CUDA thread auto-binding; CPU layout rewrite; improved tuning interface
  • Arm cascading scheduler for Arm Ethos™-U NPUs
  • Collage, which brings tuning to BYOC
  • Several microTVM improvements
  • New tvm.relay.build parameters: runtime=, executor=
  • AOT: support for the C++ runtime (with llvm and c targets only) and support for host-driven AOT in the C runtime
  • Hexagon RPC support
    • Testing via Hexagon SDK simulator and on device via Snapdragon-based HDK boards and phones
    • AOT and USMP support
    • Threading
    • Initial op support
  • MLF: support for multiple modules in a single MLF artifact
  • microTVM: host-driven AOT support
  • Several TIR schedule primitives and transforms including (abridged):
    • schedule.transform_layout - Applies a layout transformation to a buffer as specified by an IndexMap.
    • schedule.transform_block_layout - Applies a schedule transformation to a block as specified by an IndexMap.
    • schedule.set_axis_separators - Sets axis separators in a buffer to lower to multi-dimensional memory (e.g. texture memory).
    • transform.InjectSoftwarePipeline - Transforms an annotated loop nest into a pipeline prologue, body and epilogue where producers and consumers are overlapped.
    • transform.CommonSubexprElimTIR - Implements common-subexpression elimination for TIR.
    • transform.InjectPTXAsyncCopy - Rewrites global to shared memory copies in CUDA with async copy when annotated with tir::attr::async_scope.
    • transform.LowerCrossThreadReduction - Enables support for reductions across threads on GPUs.
  • TVMScript
    • Basic metaprogramming features: template metaprogramming over variables; support for inlined calls to Python functions from TVMScript.
    • Ergonomic improvements: assignment without type annotations; improved Python type hints.
  • And many more! See the list of RFCs and PRs included in v0.9.0 for a complete list, as well as the full change list.
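
For readers unfamiliar with the idea behind schedule.transform_layout, here is a minimal conceptual sketch of what "applying a layout transformation as specified by an index map" means. It is plain Python and does not use or reproduce the TVM API; the helper names index_map and apply_layout, the tile size of 4, and the buffer shape are all assumptions chosen for illustration.

```python
# Conceptual illustration only: plain Python, no TVM imports.
# `index_map` and `apply_layout` are hypothetical helper names, not TVM functions.

def index_map(i, j):
    """Map a 2-D index (i, j) to a tiled 3-D index (i // 4, j, i % 4)."""
    return (i // 4, j, i % 4)


def apply_layout(buffer, rows, cols, mapping):
    """Return a dict keyed by transformed indices that holds the same values."""
    transformed = {}
    for i in range(rows):
        for j in range(cols):
            transformed[mapping(i, j)] = buffer[i][j]
    return transformed


# An 8x2 buffer whose element at (i, j) is i * 2 + j.
rows, cols = 8, 2
original = [[i * cols + j for j in range(cols)] for i in range(rows)]

# After the transformation, element (i, j) lives at (i // 4, j, i % 4).
tiled = apply_layout(original, rows, cols, index_map)
```

The values are unchanged; only the indices used to address them differ, which is the sense in which a layout transformation rewrites how a buffer is stored and accessed.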

Check out the release notes on GitHub and the source download here.
