TVM Monthly - September 2024

ysh329 · October 1, 2024, 3:09am

As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so that users and developers can better understand what is going on in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

RFCs

The Android Neural Networks API (NNAPI) is a graph-level neural network inference API provided by the Android runtime. Prior to this RFC, TVM on Android mobile devices relied mainly on OpenCL for GPU acceleration. This RFC aims to add a new codegen and a runtime via the BYOC framework, enabling execution on custom accelerators from SoC vendors on mobile devices.

#109 - [RFC] NNAPI Integration via BYOC
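
For readers new to BYOC, the general idea is that operators an external backend supports are grouped into regions and offloaded, while everything else stays with TVM. The sketch below is a toy illustration in plain Python, not TVM or NNAPI code: the `SUPPORTED_OPS` set and the `partition` helper are hypothetical names, and a real model is a graph rather than a flat list of operators.

```python
from itertools import groupby

# Hypothetical set of operators the external backend can run.
# This is NOT the real NNAPI support list from the RFC.
SUPPORTED_OPS = {"conv2d", "relu", "add"}


def partition(ops):
    """Split a linear sequence of op names into consecutive regions.

    Each region is a (target, [ops]) pair, where target is "external"
    for runs of supported ops and "tvm" for the fallback path.
    """
    regions = []
    for offload, group in groupby(ops, key=lambda op: op in SUPPORTED_OPS):
        regions.append(("external" if offload else "tvm", list(group)))
    return regions


regions = partition(["conv2d", "relu", "softmax", "add"])
for target, group in regions:
    print(target, group)
```

Grouping consecutive supported operators into one region, instead of offloading each operator individually, keeps the number of hand-offs between TVM and the external runtime low.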

We continue to improve Relax, TIR, the frontends, and the runtimes.

BYOC

#17385 - [NNAPI] Add NNAPI backend for BYOC

BugFix

#17419 - [FFI] Grab GIL when check env signals
#17383 - [ONNX] Skip constant If node generated by PyTorch
#17360 - [FIX] fix bug when normalize iter with different lower bounds
#17148 - [Relax] Preserve existing DataflowBlock in ConvertToDataflow
#17345 - [Fix][Relax] Add the missing tree-attn func arg for KV cache creation
#17073 - [Relax] FCallPacked not checked in CodegenVMTIR
#17315 - [MSC] Bugfix for strided_slice op
#17335 - [Relax][PyTorch][Fix] use _convert_torch_tensor_to_relax() where possible
#17330 - [Relax][PyTorch] Update layer_norm converter to support immutable_list for normalized_shape
#17324 - [Fix] Remove tvm. prefix from image name when ./docker/build.sh

CI

#17410 - Upgrade unity image tag to 20240917-153130-9f281758
#17409 - [Windows] Workaround for error in FindLLVM
#17397 - Update image tag to 20240917-153130-9f281758
#17338 - Upgrade PyTorch to 2.4.1
#17337 - Disable NNPACK build and fix error on Android SDK installation
#17355 - Upgrade github upload-artifact action
#17334 - [Hexagon] Forward gtest tests into pytest as separate tests

Disco

#17398 - Enable float8 data type in disco

Dlight

#17430 - [GPU] Improve matmul schedule for adreno
#17363 - Fix Matmul rule for Conv3D

Docs

#17402 - [Doc] Update Architecture Overview
#17382 - More clarity on security model of RPC server
#17380 - [Doc] Relax Deep Dive
#17377 - Update document to include security model of RPC server
#17378 - Link to project-specific security page
#17352 - TVM pip Installation fix
#17343 - Minor fix typo in developer howto guide
#17328 - [Doc] Deep Dive TensorIR
#17327 - [Doc] How to Optimize a Language Model
#17320 - [Doc] Customize Optimization
#17319 - [Doc] Fix doc build error in e2e_opt_model.py

LLVM

#17403 - [Fix] Fix getHostCPUFeatures LLVM version cutoff
#17347 - [RUNTIME] Fix RISC-V CodeModel propagation to ORCJIT runtime executor

Relax

#17428 - Introduce static shape tuning pipeline
#17426 - [PyTorch] Support neural network ops for ExportedProgram importer
#17424 - [PyTorch] Support binary, statistical and search ops for ExportedProgram importer
#17421 - [PyTorch] Support more unary ops for ExportedProgram importer
#17396 - [PyTorch] Add support for torch.export.ExportedProgram in Relax PyTorch Frontend
#17401 - [KVCache] Attention func accepting over-padded qkv and output NDArray
#17379 - [PyTorch] Fix output shape of torch.nn.functional.scaled_dot_product_attention
#17331 - Validate StructInfo annotations in well-formed check
#17376 - [PyTorch] Cleanup Tensor Manipulation and Creation op converters
#17372 - [PyTorch] Cleanup Statistical, Search and DataType op converters
#17368 - [Transform] Add SelectNode handling in SymbolicMatcher
#17369 - [PyTorch] Cleanup Neural Network op converters
#17353 - Fix BYOC removing existing ext mods
#17366 - [PyTorch] Cleanup binary op converters
#17359 - Add new NN allgather operator
#17356 - [PyTorch] Cleanup unary op converters
#17362 - [KV Cache] Refactor _attention_sequence_prefill function to …
#17332 - Validate StructInfo of variable bindings
#17350 - [Frontend][Onnx] fix params name bug in onnx frontend
#17354 - Fix inline source module cause path too long error
#17213 - Refactor RealizeVDevice to remove in-place mutation
#17253 - [Transform] Handle tuple return in RemoveUnusedOutputs
#17285 - Require correct input/output shapes R.call_tir
#17202 - Update GlobalVar name in AttachGlobalSymbol
#17342 - [PyTorch] Add support for torch.ops.aten.sym_size.int
#17300 - [PyTorch] Add support for torchvision.ops.stochastic_depth
#17218 - Allow dynamic shape argument to R.reshape
#17326 - [KVCache] Add tree attention with paged cache support
#17325 - [PyTorch] Add support for torch.nn.functional.conv*
#17314 - [Transform] Compose preproc functions in LiftTransformParams

Relay

#17339 - [qnn]: Fix qnn.avg_pool2d layout inference

Runtime

#17407 - Add property Module.is_device_module

TIR

#17411 - [NarrowDataType] Bufferload’s index should not inherit bits constraint of value

TVMScript

#17395 - [TIR, TVMScript] Add TIR - Triton integration
#17131 - [Relax] Allow return statement in DataflowBlock
#17373 - Avoid segfault from invalid TVMScript

cuda & cutlass & tensorrt

#17408 - [CUTLASS] Add FP8 gemm kernels

web

#17420 - Allow deprecated API requestAdapterInfo with any cast
#17404 - [WASM] Implement concat embeddings

Misc

#17422 - [CMake] Add NCCL/RCCL header directory to include path
#17405 - [TVMjs] Modify web package description
#17400 - [3rdparty] Bump FlashInfer for tmp workspace reduction
#17394 - [MSC] Support concat with constant inputs
#17351 - [MSC][Refactor] Support dynamic shape
#17371 - [WEBGPU] Update runtime to remove deprecated API
#17361 - [IR] Expose ReplaceGlobalVars utility in the Python API
#17358 - Update tvmc_command_line_driver.py, modify the sentence, remove the duplicate “as”
#17344 - [MSC] Reconstruct tensorrt module
#17297 - [Apps] Remove mxnet dependency from /apps/android_camera/models
#17299 - [Apps] Remove mxnet dependency from /apps/ios_rpc
#17293 - [Rust] Remove mxnet dependency and re-enable rust example
#17321 - [Target] Refine equality check on TargetKind instances