TVM Monthly - August 2024

As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of what is happening in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

RFCs

None

This month we continued to improve Relax, TIR, the frontends, and the runtimes.

BugFix

  • #17307 - [Fix][TIR] LowerThreadAllreduce warp reduction mask
  • #17229 - [Cutlass] fix cutlass instantiate attention template bugs

CI

  • #17271 - Resolve CI compilation failures on MacOSX

Disco

  • #17275 - Fix double free of nccl communicator
  • #17264 - Disable splitting nccl communicator in single-group

Dlight

  • #17259 - [ADRENO] Fix for opencl adreno matmul schedule

Docs

  • #17306 - [Doc] Refactor How-To
  • #17296 - [Doc] Overview
  • #17298 - [Doc] IRModule
  • #17286 - Introduce Relax API and move legacy part to standalone page
  • #17289 - [Doc] Quick Start
  • #17287 - [Doc] Refactor install docs

Frontend

  • #17277 - [Relay][Pytorch] Add support for aten::tile

OpenCL & CLML

  • #17273 - [CODEGEN][OPENCL] Fix opencl codegen for few ops

ROCm

  • #17295 - Fix non-standard rocm path
  • #17290 - hipBLAS integration
  • #17256 - Support ROCm 6

Relax

  • #17309 - [Frontend][Onnx] fix expand bug in onnx frontend
  • #17313 - Identify tuple unpack/repack in CanonicalizeBindings
  • #17312 - [Bugfix] Infer TIR values from shapes inside a tuple
  • #17304 - [PyTorch] Add support for torch.repeat
  • #17305 - [Python]Rotary positional embedding scaling
  • #17243 - Avoid wrapping TupleStructInfo into a Tuple for R.call_tir
  • #17292 - [Bugfix] Support torch.unbind op and fix bugs for expand && split
  • #17291 - [PyTorch] Add support for torch.tile
  • #17224 - [Analysis] Handle recursive functions in CollectVarUsage
  • #17280 - [KVCache] Increase coalesce threshold
  • #17263 - [Bugfix] Preserve dtype in ToMixedPrecision for kNever ops
  • #17261 - Add KVCache Interface for Relax NNModule
  • #17145 - Implement R.ensure_zero_offset and update memory planning for R.view
  • #17242 - Remove segfault in R.call_tir_inplace validation
  • #17228 - [Unity][Frontend] Add Sqrt Op
  • #17234 - FuseTransposeMatmul Pass

Runtime

  • #17294 - Support KV cache with RoPE extension factor array
  • #17240 - [FFI]Use TVMValue::v_int64 to represent boolean values
  • #17252 - Revert “[FFI]Introduce runtime boxed types for int/float/bool”
  • #16183 - [FFI]Introduce runtime boxed types for int/float/bool
  • #17237 - Reorganize PagedKVCache attn kernel invocation
  • #17227 - Allow aborting fetchWithCache through AbortSignal

TIR

  • #17219 - Validate tir::Buffer axis_separators on construction

TOPI

  • #17274 - [ADRENO] Add Group Conv2d texture schedule

Web

  • #17251 - Add TVMArgBool to ArgTypeCode

Misc

  • #17301 - [TE][CreatePrimFunc] Fix create reduce block with spatial iter dependent init value
  • #17284 - [Support] Fix the Read/Write of socket stream
  • #17302 - [Codegen][WebGPU] LetNode common subexpr override
  • #17246 - [Cleanup] Remove using namespace tvm::runtime from headers
  • #17278 - [Codegen] Emit tir::Let as var assignment explicitly
  • #17260 - [WINDOWS] Compiler options for non x86 targets
  • #17249 - [IR] Handle NaN in StructuralEqual and StructuralHash
  • #17257 - [FFI] Re-introduce the boxed primitive values
  • #17265 - [CompileBugfix][contrib] meet ‘base64.h: No such file or directory’ and ‘‘tvm::runtime::vm::AllocatorType’ has not been declared’ while compiling
  • #17214 - Replacing unary ops with LookUpTable and Take op to improve performance
  • #17250 - [WebGPU] Fix unexpected device lost error when intentional dispose
  • #17236 - [3rdparty] Bump FlashInfer
  • #17233 - [Runtime Patch] Add AbortSignal to fetchWithCache in ArtifactCacheTemplate interface