As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can better understand what is happening in the TVM community.
Feedback and suggestions are welcome so that we can further improve these updates.
RFCs
The Android Neural Networks API (NNAPI) is a graph-level neural network inference API provided by the Android runtime. Prior to this RFC, TVM on Android mobile devices mainly relied on OpenCL for GPU acceleration. This RFC aims to add a new codegen and a runtime via the BYOC framework, enabling execution on custom accelerators from SoC vendors on mobile devices.
- #109 - [RFC] NNAPI Integration via BYOC
We continue to improve Relax, TIR, Frontend and other Runtimes.
BYOC
- #17385 - [NNAPI] Add NNAPI backend for BYOC
BugFix
- #17419 - [FFI]Grab GIL when check env signals
- #17383 - [ONNX] Skip constant If node generated by PyTorch
- #17360 - [FIX] fix bug when normalize iter with different lower bounds
- #17148 - [Relax] Preserve existing DataflowBlock in ConvertToDataflow
- #17345 - [Fix][Relax] Add the missing tree-attn func arg for KV cache creation
- #17073 - [Relax]FCallPacked not checked in CodegenVMTIR
- #17315 - [MSC]Bugfix for strided_slice op
- #17335 - [Relax][PyTorch][Fix] use `_convert_torch_tensor_to_relax()` where possible
- #17330 - [Relax][PyTorch]Update `layer_norm` converter to support `immutable_list` for `normalized_shape`
- #17324 - [Fix] Remove `tvm.` prefix from image name when `./docker/build.sh`
CI
- #17410 - Upgrade unity image tag to `20240917-153130-9f281758`
- #17409 - [Windows] Workaround for error in FindLLVM
- #17397 - Update image tag to 20240917-153130-9f281758
- #17338 - Upgrade PyTorch to 2.4.1
- #17337 - Disable NNPACK build and fix error on Android SDK installation
- #17355 - Upgrade github upload-artifact action
- #17334 - [Hexagon] Forward gtest tests into pytest as separate tests
Disco
- #17398 - Enable float8 data type in disco
Dlight
Docs
- #17402 - [Doc] Update Architecture Overview
- #17382 - More clarity on security model of RPC server
- #17380 - [Doc] Relax Deep Dive
- #17377 - Update document to include security model of RPC server
- #17378 - Link to project-specific security page
- #17352 - TVM pip Installation fix
- #17343 - Minor fix typo in developer howto guide
- #17328 - [Doc] Deep Dive TensorIR
- #17327 - [Doc] How to Optimize a Language Model
- #17320 - [Doc] Customize Optimization
- #17319 - [Doc] Fix doc build error in e2e_opt_model.py
LLVM
- #17403 - [Fix]Fix getHostCPUFeatures LLVM version cutoff
- #17347 - [RUNTIME] Fix RISC-V CodeModel propagation to ORCJIT runtime executor
Relax
- #17428 - Introduce static shape tuning pipeline
- #17426 - [PyTorch] Support neural network ops for ExportedProgram importer
- #17424 - [PyTorch] Support binary, statistical and search ops for ExportedProgram importer
- #17421 - [PyTorch] Support more unary ops for ExportedProgram importer
- #17396 - [PyTorch] Add support for `torch.export.ExportedProgram` in Relax PyTorch Frontend
- #17401 - [KVCache] Attention func accepting over-padded qkv and output NDArray
- #17379 - [PyTorch] Fix output shape of `torch.nn.functional.scaled_dot_product_attention`
- #17331 - Validate StructInfo annotations in well-formed check
- #17376 - [PyTorch] Cleanup Tensor Manipulation and Creation op converters
- #17372 - [PyTorch] Cleanup Statistical, Search and DataType op converters
- #17368 - [Transform] Add SelectNode handling in SymbolicMatcher
- #17369 - [PyTorch] Cleanup Neural Network op converters
- #17353 - Fix BYOC removing existing ext mods
- #17366 - [PyTorch] Cleanup binary op converters
- #17359 - Add new NN allgather operator
- #17356 - [PyTorch] Cleanup unary op converters
- #17362 - [KV Cache] Refactor `_attention_sequence_prefill` function to …
- #17332 - Validate StructInfo of variable bindings
- #17350 - [Frontend][Onnx] fix params name bug in onnx frontend
- #17354 - Fix inline source module cause path too long error
- #17213 - Refactor RealizeVDevice to remove in-place mutation
- #17253 - [Transform] Handle tuple return in RemoveUnusedOutputs
- #17285 - Require correct input/output shapes `R.call_tir`
- #17202 - Update GlobalVar name in AttachGlobalSymbol
- #17342 - [PyTorch] Add support for `torch.ops.aten.sym_size.int`
- #17300 - [PyTorch] Add support for torchvision.ops.stochastic_depth
- #17218 - Allow dynamic shape argument to R.reshape
- #17326 - [KVCache] Add tree attention with paged cache support
- #17325 - [PyTorch] Add support for `torch.nn.functional.conv*`
- #17314 - [Transform] Compose preproc functions in LiftTransformParams
Relay
- #17339 - [qnn]: Fix qnn.avg_pool2d layout inference
Runtime
- #17407 - Add property Module.is_device_module
TIR
- #17411 - [NarrowDataType] Bufferload’s index should not inherit bits constraint of value
TVMScript
- #17395 - [TIR, TVMScript] Add TIR - Triton integration
- #17131 - [Relax] Allow return statement in DataflowBlock
- #17373 - Avoid segfault from invalid TVMScript
cuda & cutlass & tensorrt
- #17408 - [CUTLASS] Add FP8 gemm kernels
web
- #17420 - Allow deprecated API requestAdapterInfo with any cast
- #17404 - [WASM] Implement concat embeddings
Misc
- #17422 - [CMake] Add NCCL/RCCL header directory to include path
- #17405 - [TVMjs] Modify web package description
- #17400 - [3rdparty] Bump FlashInfer for tmp workspace reduction
- #17394 - [MSC] Support concat with constant inputs
- #17351 - [MSC][Refactor] Support dynamic shape
- #17371 - [WEBGPU] Update runtime to remove deprecated API
- #17361 - [IR] Expose ReplaceGlobalVars utility in the Python API
- #17358 - Update tvmc_command_line_driver.py, modify the sentence, remove the duplicate “as”
- #17344 - [MSC] Reconstruct tensorrt module
- #17297 - [Apps] Remove mxnet dependency from /apps/android_camera/models
- #17299 - [Apps] Remove mxnet dependency from /apps/ios_rpc
- #17293 - [Rust] Remove mxnet dependency and re-enable rust example
- #17321 - [Target] Refine equality check on TargetKind instances