Note: This monthly report covers changes merged to the main branch only.
As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings-on of the TVM community.
Feedback and suggestions are welcome so that we can further improve these updates.
Community
We continue to improve the frontends and other runtimes.
Pull Requests
Below is a high-level summary of the PRs closed in the last month, grouped by area.
Adreno
- #15830 - Minor changes for Adreno docs and help scripts
- #15671 - [VM] Fix using buffers for weights in VM
Arith
- #15665 - Fix detection of non-divisible iteration forms like (x % 255) // 16
- #15677 - [BugFix] Abort IterMapRewriter rewriting once a failure occurs
BugFix
- #15773 - [CPP] Fix cpp deploy bug
- #15778 - [Hotfix] Fix Windows Pipe
- #15748 - Move symbols that are relevant to the runtime from libtvm to…
- #15752 - [Relay] Fix the wrong calculation logic of operator flip in the PyTorch frontend
- #15715 - [Relay] Fix the wrong implementation of operator Threshold in OneFlow
- #15711 - [Strategy] Fix arm_cpu int8 conv2d strategy for dotprod and i8mm targets
- #15717 - [Relay] Fix the wrong implementation of Softplus in OneFlow
CI
- #15668 - Allow limiting CPUs in Docker
Docker
- #15819 - Install oneflow from PyPi
Frontend
- #15838 - Fix unnecessary pylint errors
- #15802 - [SkipCI][Hotfix][TFLite] Disable test of quantized floor mod
- #15790 - [TFLite] Support quantized LESS_EQUAL
- #15775 - [TFLite] Support quantized GREATER_EQUAL
- #15769 - [TFLite] Support quantized NOT_EQUAL
- #15768 - [TFLite] Support quantized div
- #15746 - [TFLite] Support quantized LESS
- #15733 - [TFLite] Support quantized floor_mod
- #15724 - [TFLite] Support quantized floor_div
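The quantized TFLite additions above (#15724, #15733, #15746, #15768, #15769, #15775, #15790) extend the existing Relay TFLite importer, so quantized models using these operators flow through the usual import path. A minimal sketch of that path follows; the model file name, input name, shape, and dtype are placeholder assumptions, not values taken from these PRs.

```python
# Minimal sketch of importing a quantized TFLite model through the Relay
# frontend. "model_quant.tflite", the input name "input", and its shape/dtype
# are placeholder assumptions.
import tvm
from tvm import relay

with open("model_quant.tflite", "rb") as f:
    tflite_model_buf = f.read()

import tflite  # flatbuffer bindings used by the importer
tflite_model = tflite.Model.GetRootAsModel(tflite_model_buf, 0)

# The importer maps quantized TFLite ops (e.g. LESS, NOT_EQUAL, DIV, FLOOR_DIV)
# to the corresponding Relay operators.
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "uint8"},
)

# Compile as usual; the quantized ops are lowered like any other Relay op.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
```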
Hexagon
MetaSchedule
- #15792 - Allow generating uint random data
Metal
- #15756 - [Unittest] Add minimal metal functionality test to CI
- #15749 - [UnitTest] Parametrize allreduce GPU tests
OpenCL & CLML
- #15745 - [OpenCL] Don’t initialize OpenCL runtime on host
ROCm
- #15777 - [Codegen] Mismatched Dtype of Workgroup/Workitem
Relay
- #15648 - [TOPI] Remove input padding for arm_cpu conv2d int8 native schedule in Legalize pass
- #15386 - Fix an adaptive_max_pool1d operator conversion bug
Runtime
- #15693 - Make CSourceModule and StaticLibraryModule Binary Serializable
TIR
- #15816 - Revert "[TensorIR][Visitor] Visit buffer members in match_buffer's in block visitor functions (#15153)"
- #15763 - Do not drop 4th argument to tir.max
- #15646 - Output DeclBuffer in LowerThreadAllreduce
TOPI
- #15685 - [Target] Use LLVM for x86 CPU feature lookup
- #15710 - Ensure vectorization of input padding in arm_cpu int8 conv2d interleaved schedule
TVMC
- #15779 - Enable dumping imported modules too
TVMScript
- #15824 - Preserve traceback across TVMScript parsing
- #15762 - Use environment variable TVM_BLACK_FORMAT for .show()
- #15706 - Disable black_format by default
- #15705 - [FIX] Disable show_object_address in printing by default
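Taken together, the printing changes above (#15762, #15706) make black formatting of .show() output opt-in via an environment variable. A minimal sketch, assuming TVM_BLACK_FORMAT accepts a truthy value such as "1" and that the black package is installed:

```python
# Minimal sketch: opt back in to black-formatted TVMScript output via the
# TVM_BLACK_FORMAT environment variable (#15762). Treating "1" as a truthy
# value is an assumption; set the variable before printing any script.
import os
os.environ["TVM_BLACK_FORMAT"] = "1"

import tvm
from tvm.script import tir as T

@T.prim_func
def add_one(a: T.handle, b: T.handle) -> None:
    A = T.match_buffer(a, (8,), "float32")
    B = T.match_buffer(b, (8,), "float32")
    for i in range(8):
        with T.block("add"):
            vi = T.axis.spatial(8, i)
            B[vi] = A[vi] + T.float32(1)

add_one.show()  # prints the TVMScript for the PrimFunc
```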
microTVM
- #15667 - Check the output of microNPU demos in CI
Misc
- #15818 - [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
- #15761 - [Target] LLVM helper functions for any target info
- #15672 - [IR] Implemented Variant<…> container
- #15714 - [Target][Device] Auto detect target and create device from str in torch style
- #15723 - fix _convert_simple_rnn
- #15725 - Revert “[CodeGenC] Handle GlobalVar callee as internal function call”
- #15684 - [Hopper TMA] Add intrinsic to create barriers for synchronization
- #15683 - Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2]
- #15596 - [FFI] Propagate Python errors across FFI boundaries
- #15666 - [Module] Implement custom imported modules serialization
- #15656 - [Hopper TMA] Add CUDA codegen support for bulk asynchronous copy
- #15664 - [IR] Use structural equal for Range equality
- #15649 - Add output_data_sec section in corstone300.ld
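For #15596, the intended user-visible effect is that an error raised inside a Python callback invoked through the C++ FFI surfaces as the original Python exception with its traceback rather than a generic TVMError. A minimal sketch; the registered function name is an arbitrary example and the exact exception surfaced may vary by TVM version.

```python
# Minimal sketch: a Python error raised inside a registered callback
# round-trips Python -> C++ -> Python. The global function name
# "demo.raise_value_error" is arbitrary.
import tvm

@tvm.register_func("demo.raise_value_error")
def _raise(msg):
    raise ValueError(msg)

f = tvm.get_global_func("demo.raise_value_error")
try:
    f("bad input")
except ValueError as err:  # original exception type is preserved
    print("caught:", err)
```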