TVM Monthly - September 2023

Note: This monthly report covers the main branch only.

As discussed by the TVM PMC, our goal is to provide a monthly summary of the project so users and developers can better understand what is happening in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

We continue to improve the frontends and the runtimes.

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Adreno

  • #15830 - Minor changes for Adreno docs and help scripts
  • #15671 - [VM]Fix using buffers for weights in VM

Arith

  • #15665 - Fix detect non-divisible iteration form like (x % 255) // 16
  • #15677 - [BugFix]IterMapRewriter abort rewriting once failure
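For readers wondering why a form like (x % 255) // 16 counts as non-divisible, here is a small plain-Python illustration. It is not TVM's detection code; the helper names bucket_sizes and is_divisible_form are hypothetical and exist only for this sketch. It shows that when the modulus is not a multiple of the divisor, the last bucket is shorter than the others, so the expression does not split the iteration into equal parts.

```python
from collections import Counter


def bucket_sizes(modulus, divisor):
    """Count how many x in [0, modulus) map to each value of (x % modulus) // divisor."""
    return Counter((x % modulus) // divisor for x in range(modulus))


def is_divisible_form(modulus, divisor):
    """True when all buckets have the same size.

    For divisor <= modulus this is the same as modulus % divisor == 0.
    """
    return len(set(bucket_sizes(modulus, divisor).values())) == 1


# 255 = 15 * 16 + 15, so the last bucket holds only 15 values.
uneven = bucket_sizes(255, 16)
# 256 = 16 * 16, so every bucket holds exactly 16 values.
even = bucket_sizes(256, 16)

print(sorted(set(uneven.values())))  # bucket sizes seen for 255 and 16
print(sorted(set(even.values())))    # bucket sizes seen for 256 and 16
```

With a modulus of 256 the same expression splits evenly, which is the divisible case.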

BugFix

  • #15773 - [CPP] Fix cpp deploy bug
  • #15778 - [Hotfix] Fix Windows Pipe
  • #15748 - Move symbols that are relevant to the runtime from libtvm to…
  • #15752 - [Relay]fix the wrong calculate logic of operator flip in PyTorch frontend
  • #15715 - [Relay]Fix the wrong implementation about operator Threshold in oneflow
  • #15711 - [Strategy] Fix arm_cpu int8 conv2d strategy for dotprod and i8mm targets
  • #15717 - [Relay]fix the wrong implementation of Softplus in OneFlow

CI

  • #15668 - Allow Limit CPUs in Docker

Docker

  • #15819 - Install oneflow from PyPi

Frontend

  • #15838 - Fix unnecessary pylint errors
  • #15802 - [SkipCI][Hotfix][TFLite] Disable test of quantized floor mod
  • #15790 - [TFLite]Support quantized LESS_EQUAL
  • #15775 - [TFLite]Support quantized GREATER_EQUAL
  • #15769 - [TFLite]Support quantized NOT_EQUAL
  • #15768 - [TFLite]Support quantized div
  • #15746 - [TFLite]Support quantized LESS
  • #15733 - [TFLite]Support quantized floor_mod
  • #15724 - [TFLite]Support quantized floor_div

Hexagon

  • #15788 - Properly handle RPC server shutdown
  • #15599 - F2qi avgpool bug fix

MetaSchedule

  • #15792 - Allow generating uint random data

Metal

  • #15756 - [Unittest]Add minimal metal functionality test to CI
  • #15749 - [UnitTest]Parametrize allreduce GPU tests

OpenCL & CLML

  • #15745 - [OpenCL] Don’t initialize OpenCL runtime on host

ROCm

  • #15777 - [Codegen]Mismatched Dtype of Workgroup/Workitem

Relay

  • #15648 - [TOPI] Remove input padding for arm_cpu conv2d int8 native schedule in Legalize pass
  • #15386 - Fix an adaptive_max_pool1d operator conversion bug

Runtime

  • #15693 - Make CSourceModule and StaticLibraryModule Binary Serializable

TIR

  • #15816 - Revert "[TensorIR][Visitor] Visit buffer members in match_buffer's in block visitor functions (#15153)"
  • #15763 - Do not drop 4th argument to tir.max
  • #15646 - Output DeclBuffer in LowerThreadAllreduce

TOPI

  • #15685 - [Target]Use LLVM for x86 CPU feature lookup
  • #15710 - Ensure vectorization of input padding in arm_cpu int8 conv2d interleaved schedule

TVMC

  • #15779 - enable dumping imported modules too

TVMScript

  • #15824 - Preserve traceback across TVMScript parsing
  • #15762 - Use environment variable TVM_BLACK_FORMAT for .show()
  • #15706 - Disable black_format by default
  • #15705 - [FIX] Disable show_object_address in printing by default

microTVM

  • #15667 - Check the output of microNPU demos in CI

Misc

  • #15818 - [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
  • #15761 - [Target] LLVM helper functions for any target info
  • #15672 - [IR] Implemented Variant<…> container
  • #15714 - [Target][Device] Auto detect target and create device from str in torch style
  • #15723 - fix _convert_simple_rnn
  • #15725 - Revert “[CodeGenC] Handle GlobalVar callee as internal function call”
  • #15684 - [Hopper TMA] Add intrinsic to create barriers for synchronization
  • #15683 - Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2]
  • #15596 - [FFI] Propagate Python errors across FFI boundaries
  • #15666 - [Module] Implement custom imported modules serialization
  • #15656 - [Hopper TMA] Add CUDA codegen support for bulk asynchronous copy
  • #15664 - [IR] Use structural equal for Range equality
  • #15649 - Add output_data_sec section in corstone300.ld