As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings-on of the TVM community.
Feedback and suggestions are welcome so that we can further improve these updates.
RFCs
None
We continue to improve Relax, TIR, Frontend, and other Runtimes.
BYOC
- #16567 - Skip processed functions in FuseOpsByPattern and RunCodegen
BugFix
- #16649 - [FFI] Add a missing default for datatype lanes
- #16492 - [Executor] fix debug_executor function debug_get_output
- #16598 - [Transform]Handle non-composite lambda functions in FuseOps
- #16565 - [Transform] Keep private non-primitive functions in FuseTIR
- #16518 - Use x*x*x instead of pow(x,3)
CI
- #16611 - [AOT][Testing] Print output values on test failure
- #16546 - Disable testing that downloads from mxnet
- #16521 - Fix CI Script and Broken Tests
- #16502 - Support tvm-bot rerun for tvm-unity task
Docs
- #16610 - [Doc] Fixed Docstring usage example in tvm.ir.make_node
- #16572 - [Doc] Remove MxNet related tutorials
- #16514 - [Unity][Doc] Document passes that depend on DataflowBlocks and encourage using ConvertToDataflow
Frontend
- #16604 - [Relax][Onnx] fix clip unsqueeze opset implement
- #16616 - [PaddlePaddle] Support conv2d when data_format is NHWC
- #16526 - [Keras] Enable Dense operator for any input dims
LLVM
- #16612 - [SVE] Add support for scalable data type strings
- #16523 - [SVE] Change the dtype of Ramp and Broadcast lanes to PrimExpr
Metal
- #16605 - [RUNTIME]Fix multithreading access of metal runtime
ROCm
- #16550 - [RUNTIME]Properly align rocm parameter buffer
Relax
- #16591 - [Unity][Transform] Handle dynamic shapes in CombineParallelMatmul
- #16594 - [Transform] Preserve param names in LiftTransformParams
- #16575 - [Unity] GPU sampling
- #16574 - Additional unit tests for RemoveUnusedParameters
- #16585 - [Unity][Analysis] Include impure call in VerifyWellFormed errors
- #16421 - [Unity][Transform] Raise error in FuseOpsByPattern for SSA violation
- #16629 - Fix error message in BlockBuilder
- #16592 - Handle dynamic arguments in legalization of nn.attention
- #16590 - [Unity][Transform] Check for permute_dims in ExpandMatmulOfSum
- #16563 - Implement operators to read runtime DLTensor* information
- #16581 - [Unity][MSC][M4.2][Step2] Enable plugin with manager, test plugins in compile pipeline
- #16600 - Expose name_hint field for BlockBuilder.match_cast
- #16601 - [Transform] Canonicalize let var = R.const bindings
- #16583 - [Unity][VM] Recursively visit match bindings in VMShapeLowerMutator
- #16586 - Ignore non-relax functions in relax.transform.RunCodegen
- #16573 - [VM] Re-implementation of callback functions
- #16561 - [Bugfix]Remove call to tvm.build for empty TIR module
- #16564 - [Unity] Check for symbolic vars in PrimValue when lowering to TIR
- #16558 - Minor updates for NN frontend
- #16542 - Support callback as argument
- #16487 - [Unity][Transform] Handle call_tir_inplace in FuseTIR and FuseOps
- #16355 - [Unity] Infer struct info for relax.op.split on dynamic-sized index
- #16465 - [Redo][Unity] Split DecomposeOpsForTraining into two steps
- #16495 - [Unity][MSC][M4.2][Step1] Enable plugin with manager, test plugins in compile pipeline
- #16498 - [Frontend] “tensor_ir_inplace” op
- #16500 - [Unity] Support storage reuse for dynamic shapes
Relay
- #16622 - [ONNX] Fix the attribute mode parse of operator Upsample
- #16626 - [ONNX] Fix the Resize operator in ONNX frontend
- #16624 - [ONNX] fix the wrong default value about dtype in Multinomial converter
Runtime
- #16635 - [RPC] Enable RPCObjectRef over multi-hop RPC
- #16630 - Add TVM_DLL to threading backend funcs
- #16568 - [Relax]RNNState for State Space Models
- #16541 - Add “TVM_DLL” to NDArray cache load func
- #16545 - Fix dtype conversion for bf16 and fp8
- #16508 - ParallelFor skipping thread backend for unit extent
TIR
- #16544 - Expand debug symbol output for CodeGenLLVM
- #16553 - Fix get_block_access_region for let bindings
- #16515 - Require exactly same-dtype matching for Vulkan smem reuse
TVMScript
- #16640 - Represent tir::builtin::ret() using python “return”
- #16562 - [Bugfix]Handle R.match_cast as last binding in if/else
- #16593 - [Unity]Parse R.Object return type from call_pure_packed
- #16356 - [Unity]Optionally hide StructInfo that can be inferred
cuda & cutlass & tensorrt
- #16619 - [Bugfix][Cutlass] Check if function attributes is None
microNPU
- #16401 - [microNPU][ETHOSU] Add fixed point for matmul
web
- #16631 - Fix NDArrayCache loading report callback
- #16525 - Move ArtifactCache to Interface, Support Cache delete and Batch Delete, Remove typo
- #16554 - Compatibility with PagedKVCache in WebGPU
- #16527 - Revert “[Unity]Temp disable wasm exception (#16444)”
- #16504 - [Relax]Add ApplyPresenceAndFrequencyPenalty
Misc
- #16595 - [Transform] Check for zero-param operators in LiftTransformParams
- #16639 - [Disco] Expose functions to query the per-worker device/rank
- #16617 - [Disco] Implement Session.import_python_module method
- #16599 - [Transform] De-duplicate MatchCast nodes in EliminateCommonSubexpr
- #16596 - [Transform] Implement relax.transform.ReorderPermuteDimsAfterConcat
- #16597 - [Transform] Allow explicit name of bundled model parameters
- #16602 - [Transform] Improvements to LazyTransformParams
- #16579 - [Dlight] Scheduling Low batch GEMM using GEMV-like rule
- #16606 - [KVCache] Support passing in attn_score_scaling_factor into KV cache
- #16608 - Extend gpu memory bandwidth test to work through RPC
- #16587 - [Debug] Improve error message for codegen pattern mismatches
- #16570 - [Marvell BYOC]: Marvell AI Accelerator Integration - Phase 1
- #16576 - Update the 3rdparty/libflash_attn submodule
- #16580 - [KVCache] Support mode “None” for Rotary Embedding
- #16578 - [KVCache] Support returning query positions
- #16571 - Fix compile warnings
- #16540 - [Upd] Enable lld search to include /opt/rocm/llvm/bin for rocm
- #16539 - Improve error message in NDArray::CopyFromTo
- #16524 - [Build] Improving debug and build-dir options
- #16551 - [KVCache] Fix attention kernel for ROCm
- #16512 - Cut pytest-lazy-fixture
- #16506 - Bump 3rdparty/cutlass_fpA_intB_gemm version
- #16511 - [Minor] Fix Clang compilation warning in fuse_tir.cc and codegen_c_host.cc
- #16516 - Add Relax, Unity Tags in make_notes.py
- #16497 - [Instrument] Add default instrument to print all passes
- #16494 - [DPL] Support tir_vars field in is_call_tir pattern