As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings on of the TVM community.
Feedback and suggestions are welcomed so that we can further improve these updates.
RFCs
None
We continue to improve Relax, TIR, the frontends, the runtimes, and other components.
BugFix
- #17142 - Allow import of TVM when current directory is read-only
- #17138 - [Fix][TIR] Fix outdated call to create extern buffer in make_extern
- #17132 - Restrict CopyOnWrite to _type_final
CI
- #17221 - Reduce logging level when checking if docker image exists
- #17206 - Update dummy-variable regex for pylint
- #17117 - [CLML]Fix for few clml regression issues
- #17155 - Remove lint step from `unity/pr-head` step
Disco
- #17182 - Implement SocketSession
- #17191 - Cross-group and p2p send/receive primitives
- #17180 - Group-wise operation
Dlight
- #17187 - [GPU] Add OpenCL dequant matmul schedule
Docs
- #17146 - [DOC] Fix typo for the “We utilize the intermediate representation of nn.Graph to convert the OneFlow model to Reley.”
Hexagon
- #17204 - Fix LWP assembly handler (predicate register)
- #17169 - [CMake] Fix v66 build issue
- #17162 - Support RPC execution of existing shared lib
- #17123 - Add support for v75
LLVM
- #17199 - Fix for getHostCPUFeatures API change
MetaSchedule
- #17166 - Replace `xgboost.rabit` with `xgboost.collective` because it’s deprecated
- #17171 - Add a testcase for padded conv2d in meta_schedule
ROCm
- #17141 - [Backend]Fix error when building TVM with LLVM 19
Relax
- #17201 - [Transform]Handle `is_group` argument in IPC AllReduce
- #17198 - Disable fusion for fetching from the packed params in FuseOps
- #17149 - Implement Rewriter class for pattern-rewrite
- #17189 - [PyTorch] Add support for `torch.nn.functional.max_pool2d`
- #17192 - [KVCache] Partial layers support
- #17186 - [PyTorch] Add support for torch.einsum
- #17184 - [PyTorch] Add support for torch.permute
- #17157 - Integrate cuDNN attention
- #17167 - [ONNX] Add support for Sign and Not
- #17121 - [BugFix] Fix a bug about the IR construction in test file
- #17160 - Fix fuseOps via pattern
- #17139 - Fix cublas dispatch for corner cases
- #17127 - [KVCache] Support fork in sliding window sink part
Relay
- #17177 - [FQ2I]: Use appropriate dtype while quantizing relay.op.nn.pad…
Runtime
- #17208 - Allow aborting fetchNDArray through AbortSignal
TIR
- #17158 - [Analyzer] Simplify `x==x` expressions for all dtypes
- #17134 - [Schedule] Remove `@type_check` for `set_axis_separator`
TOPI
- #17091 - Add dense schedule for fp16 and fp32 using gemm
Misc
- #17190 - [Cython][FFI] Fix crash when call del operator for handle
- #17170 - Pass to eliminate redundant branch and overcompute
- #17185 - Remove and replace deprecated `distutils.util.strtobool()`
- #17188 - Add `packaging` to `python/gen_requirements.py`
- #17181 - [FFI] Add python signal handler for ctypes FFI
- #17173 - Use `packaging.version.parse` instead of `distutils.version.LooseVersion`
- #17174 - [TVMJS] Check DataType.NUMPY2STR when saving array
- #17168 - [Meta Schedule][XGBoost] enable custom callback func test with xgboost>=1.6.0
- #17156 - [release][Dont Squash] Update version to 0.17.0 and 0.18.0.dev on main branch
- #17135 - [QoL][IR] Provide default constructor for NameSupply/GlobalVarSupply
- #17125 - [Utils] Define line-length for “ruff format”
- #17152 - GraphExecutor: Fix wild pointer assign when input and output are reshape
- #17150 - [WebGPU] Fall back to 256MB for maxBufferSize if needed
- #17128 - [Compute-inline] Prefer T.where for reverse compute-inlined block with predicate
- #16976 - [WebGPU] Implement `tir.dp4a` with WGSL built-in function `dot4I8Packed`
- #17124 - [WebGPU] Add `tir.dp4a`
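
Several of the Misc changes above (#17185, #17173, #17188) are part of the ongoing migration away from the deprecated `distutils` module, which was removed from the standard library in Python 3.12. A minimal sketch of the kind of drop-in replacement this involves; the helper below is illustrative and not TVM's actual code:

```python
def strtobool(value: str) -> bool:
    """Convert a truthy/falsy string to a bool, mirroring the old
    distutils.util.strtobool semantics (but returning bool rather than int)."""
    value = value.strip().lower()
    if value in ("y", "yes", "t", "true", "on", "1"):
        return True
    if value in ("n", "no", "f", "false", "off", "0"):
        return False
    raise ValueError(f"invalid truth value {value!r}")

# Likewise, distutils.version.LooseVersion comparisons can be rewritten
# with the third-party `packaging` library (hence its addition to
# python/gen_requirements.py in #17188), e.g.:
#   from packaging.version import parse
#   parse("1.6.0") < parse("1.10.0")
```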