Note: This monthly report covers the main branch only.
As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings on of the TVM community.
Feedback and suggestions are welcomed so that we can further improve these updates.
Community
We continue to improve TensorIR and other runtimes. A new RFC, Clarify Community Strategy Decision Process, was proposed.
Pull Requests
Below is a high-level summary of the PRs closed in the last month, grouped by area.
ArmComputeLibrary
- #15600 - [ACL] Update Compute Library to v23.05.1
CMSIS-NN
- #15407 - Support for Softmax Int16 operator
ROCm
cuda & cutlass & tensorrt
- #15573 - [CUTLASS][Cherry-pick] Introduce several features of cutlass profiler
Frontend
- #15472 - [Relay][TFLite] Fix in qnn.conv2d when parameter groups not equal to 1
MetaSchedule
- #15574 - Fix metaschedule flop estimation for non-integer loop dimensions
- #15532 - Enable subprocess to stdout for DEBUG level
Arith
- #15628 - Added simplification rule for multiple equality compares
- #15558 - Fix detect linear equation with uint var
- #14690 - Add tvm::arith::PresburgerSetNode to work with Presburger Set in MLIR
- #15555 - Fix handling of overlapping predicates
- #15471 - Enhance Canonical Simplify for LE
Relay
- #15533 - Disable exception for ADT in mixed precision pass
- #15506 - [Strategy] Use x86 pool schedules for arm_cpu
- #15470 - [Strategy] Use x86 dense schedules for arm_cpu
- #15392 - add redirecting operation to dataflow pattern graph
- #15468 - [Strategy] Fix arm_cpu int8 conv2d schedule selection for 32-bit targets
- #15461 - Stop ToMixedPrecision when constant is out of dtype range
Runtime
- #15637 - [Backport] Fix ICE from Clang
- #15244 - Serialization/Deserialization of runtime module
- #15630 - Utils to Stringify Device
- #15623 - Expose ModuleGetFunction as PackedFunc
- #15595 - Enhance PackedFunc Metaprogramming with PackArgs
- #15543 - [Minor] Suppress verbose logging in Metal device API
TOPI
- #15513 - check empty array of x86 injective’s iters
TIR
- #15579 - Optionally output the address as part of variable names
- #15564 - Use triple-quoted python strings for metadata
- #15547 - Create loop var with min_val dtype in for frame
- #15492 - Allow use of Python builtins in script
- #15442 - Support starred indices in for-loop
- #15493 - Output DeclBuffer in SplitHostDevice
- #15517 - Shuffle in PointerValueTypeRewrite for scalar reads
- #15263 - Output DeclBuffer in MakePackedAPI
- #15465 - [TIR, Schedule] Fix decompose reduction with thread binding loops
BugFix
- #15629 - [VTA] tvm.tir.Call has no name attribute
- #15602 - [ONNX] Support If body with free variable from graph input
- #15584 - [Relay][Strategy] Enable compile time transformation of weights matrix for arm_cpu NHWC quantized conv2d
- #15542 - [Fix] Fix the typo in compile flag
- #15484 - [TOPI] Fix a bug in arm_cpu int8 conv2d i8mm schedule
- #15473 - [Relay] Fix some bugs of dominator pattern
- #15480 - [CUTLASS] CUTLASS path finding
- #15478 - [TIR] ThreadSync with shared.dyn awareness
CI
- #15568 - [Testing] Allow Capitalized name in CompareBeforeAfter
- #15519 - [TEST] Run tests/python/relay/aot tests in ci-cortexm
- #15485 - Remove cython version pin
Docs
Misc
- #15639 - Do not link LLVM libraries into cpptest binary
- #15631 - [RPC] Enhance RPC Protocol to support TVM Object
- #15624 - [CMake] Add RCCL to TVM and TVM Runtime
- #15616 - [Hopper TMA] CUDA codegen for async copy with barrier synchronization
- #15537 - [CPP_RPC] export listdir for RPC
- #15605 - [CMake] Add NCCL to TVM and TVM Runtime
- #15580 - Fix “to” duplicate word in python and C header file
- #15581 - Remove duplicate load word inside .cc file
- #15582 - Remove duplicate ‘from’ word inside python script
- #15554 - Bump tornado from 6.1 to 6.3.3 in /apps/microtvm
- #15552 - Bump tornado from 6.1 to 6.3.3 in /apps/microtvm/ethosu
- #15553 - Bump tornado from 6.1 to 6.3.3 in /apps/microtvm/cmsisnn
- #15536 - fixed typo [TypoFix]
- #15529 - [quantize] fix bug of annotate for output of add op
- #15535 - Fixed search task comment
- #15530 - Remove duplicate msg word and condition inside the function doc
- #15511 - Remove IRModule Dependency from Target
- #15525 - Fix typo mistake and change whethe to whether
- #15524 - Remove duplicate the word
- #15103 - [CodeGenC] Handle GlobalVar callee as internal function call
- #15419 - [VM][Textures] Enable OpenCL textures for VM
- #15483 - [Script] Be more careful when generating ast.ExtSlice for Subscript
- #15469 - [CYTHON] Make cython compatible with 3.0