TVM Monthly - Aug 2020

As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so users and developers can better follow what is happening in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

The community welcomes new committer Krzysztof Parzyszek (@kparzysz-quic) and new reviewer Chenfan (@jcf94).

This forum received 107K pageviews and 2.8K user visits in the last month.

Features and Improvements

Over the past month, the community made good progress on the auto scheduler, codegen, operator/backend coverage, performance optimization, the command line driver, refactoring, and more.

Here are a few highlights:

  • Introduction of metadata parsing and the new diagnostic error handling #6162
  • The addition of a command line driver for TVM, TVMC #6112
  • The Ethos-N BYOC integration #6222
  • The addition of various auto scheduler features, such as search policy and cost models #6310, #6269, #6270, #6190, #6187, etc
  • Improvement and optimization of Autodiff #6078
  • The addition of Hexagon codegen #6261

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Runtime

  • Support random fill #5913
  • Use new to avoid exit-time de-allocation order #6292
  • Add parallel_for support to run a loop in parallel #6275

TIR

  • Enhance VerifyGPUCode #6194
  • HoistIfThenElse added #6066
  • Hybrid Script Support for TIR #6227
  • Enforce buffer pointer var type to be consistent with dtype. #6317
  • Block scope hoisting added #6238

Target

  • Rename target_id => target_kind #6199
  • 64-bit RPi4b target #6211
  • Creating Target from JSON-like Configuration #6218
  • Add python binding to new JSON target construction #6315
  • Use target class in all codegens #6347

Codegen

  • Initial support for Hexagon codegen #6261
  • Add --runtime=c, remove micro_dev target, enable LLVM backend #6145
  • Add tvm::support::hexdump() debug utility #6154

Relay

  • Basic block normal form #6152
  • OneHot operation #6209
  • Support combine multiple dense op just into dense #6062
  • Add Dynamic Resize Op #6198
  • Dynamic full operator #6260
  • Fix node indices attribute error for tensorflow 2.3 #6288
  • Make the max number of fused ops configurable #6327
  • Implementation of the dynamic pad operator #6284
  • change device annotation from post DFS to recursive #6124
  • Dynamic upsampling relay op #6273
  • Make check stricter: disallow inserting function with free vars into module #6313
  • Support for PyTorch Non-Maximum Suppression #6314
  • Make check stricter by using Feature. Fixed multiple bugs #6326
  • Resize support for NCHW-convertible layouts #6293
  • Make AutoDiff thread through global function #6336
  • Create Interpreter for each constant subgraph #6195
  • Parser 2.0 part 2 #6162

Ansor

  • Phase 1: The base class for cost models #6187
  • Phase 2: Basic CPU Sketch Search Policy #6184
  • Phase 1: feature extraction for cost models #6190
  • Phase 1: XGBoost Cost Model #6270
  • Phase 2: Basic GPU Sketch Search Policy #6269
  • Phase 2: Evolutionary Search #6310
  • Phase 2: Update heavy operations with parallel_for #6348

BYOC

  • json_node.h should include data_type.h #6224
  • Improve installation tutorial #6170
  • Add support for dense (fully connected) layer #6254
  • Introduce the Ethos-N BYOC integration #6222
  • Enable remote device via environment variables #6279
  • Improved pooling support #6248
  • Add support for quantized convolution #6335

PyTorch

  • Add Pytorch advanced indexing #6318
  • Support index_select #6295
  • Fix cast to long #6301
  • Fix dtype handling for modules with integer parameters #6311
  • pytorch frontend support conv1d #6203
  • Add cast to double, fix flatten conversion #6357
  • Fix aten::max and aten::min conversion #6372
  • Match pytorch 1.6 googlenet pretrained model (#6201) #6212
  • Add unbiased variance op and corresponding support in pytorch frontend #6232

TFLite

  • Implemented PADV2 Operator for TFLite and added support for constant values in PAD. #6167
  • Implemented ONE_HOT Operator for TFLite. #6223
  • Implemented EXPAND_DIMS Operator for TFLite. #6243
  • Implemented REVERSE_V2 Operator for TFLite. #6304
  • Implemented MATRIX_SET_DIAG Operator for Relay/TOPI and TFLite Frontend. #6303
  • RESHAPE with dynamic shape arg in TFLite frontend #6208
  • Constant input attr added to fully connected operation in TFLite frontend #6228
  • Gather operation with indices as tensor expr in TFLite frontend #6168
  • Added support for tflite quantized maximum and minimum #6018

Other frontends

  • Unary ops support added in frontend #6196
  • Introduce caffe frontend for tvm #6206
  • Keras softmax and prelu fix under NHWC #6278
  • add support for MXNET numpy operators #6054
  • Refine tensorflow frontend 1.x & 2.x compatibility #6240
  • Reduceops support added to frontend #6252
  • Update precision in the ONNX strided_slice, update precision of ToScalar #6272

TOPI

  • Use auto-tuner to improve conv2d_gemm performance #6117
  • topi -> tvm/topi #6186

Build and CI

  • TVMC - a command line driver for TVM (Part 1) #6112
  • Remove topi from the CI cache #6188
  • Remove libtopi from the build #6189
  • Update build support for cross compiling apps/cpp_rpc with OpenCL #6229
  • Add docker/lint.sh, for running dockerized lint scripts locally #6333
  • Add gpuonly tests for python unittests and integration #6346

Quantization

  • Add Quantize/Dequantize Partitioning #5940

Fixes

  • Temporary disable conv2d grad strided flaky test #6183
  • Avoid unexpected throw in AttrInitEntry #6128
  • Fix alignment of note #6181
  • Added casting to hybrid script doc and fixed pass infra doc #6174
  • Fix #6205 #6207
  • Change the meaning of conv3d_transpose output_padding to match conv{1,2}d_transpose #6065
  • Fix compile warnings. #6204
  • Fix -mfloat-abi=soft compilation for ARM with OpenCL target #6150
  • Enable auto conversion String->DLDataType #6214
  • Update pass infra tutorial #6193
  • Mod operator, bug fix #6160
  • Fix compilation error with cuda 11 #6213
  • Fix port_end wrong default value 9199 to 9099 for keeping same with source code #6220
  • Std op without specified dimensions support #6226
  • Fix typo #6230
  • Verify that tensor reshape is valid. #6215
  • Fix crt building and running error #6231
  • Fix conv2d_transpose output padding #6236
  • Fix cuda half math function is undefined: hpow, htanh #6225
  • Split MKL from BLAS. #6182
  • Fix division range estimation error in simplifier #6244
  • Support overriding RPCWatchdog termination behavior on Android and other platforms #6216
  • Revert "fix cuda half math function is undefined: hpow, htanh" #6249
  • Fix newer GCC compiler warnings. #6257
  • Support _contrib_SyncBatchNorm #6245
  • Fix reduction #6250
  • Add apt repository for clang-11 and llvm-11 #6256
  • Update tutorial to new TARGET as micro_dev is no more #6262
  • Improve NHWC depthwise convolution for AArch64 #6095
  • Fix clang-format #6264
  • Trivial fix, up the rodata section for the discovery board to 512 bytes. #6259
  • Fix cuda half math function is undefined: hpow, htanh #6253
  • Add dilation in x86 NCHWc depthwise conv support #6267
  • Decrease test times by introducing testing model #6235
  • Add support for parsing the any dimension. #6277
  • Improve error messages for memory verifier and gpu memory verifier #6281
  • Update ci-cpu to the latest #6283
  • Enable CI for Ethos-N #6171
  • Reflect Compile-Time CMake Options into libtvm.so #6280
  • Add cmake options into libinfo #6286
  • Update slice to infer attributes when not graph inputs #6276
  • Support int4/int8 conv2d tensor core with HWNC layout #6121
  • Use rpc.LocalSession for simple tests #6294
  • Optimize and eliminate the Jacobian tensor for te.autodiff #6078
  • Fix flaky test #6307
  • Multiple output support, reshape, split ops added #6296
  • Fix random fail #6312
  • Fix resize test #6298
  • Fix cython FFI compact with np.int64 #6321
  • Fix relay vm optimize #6322
  • Changed TVMCTVMContext to TVMContext #6306
  • Make able to compile with MSVC #6341
  • ROCm changed name of library and removed the old one in ROCm 3.7 release. #6345
  • Add init member to ReduceNode #6138
  • Quantize operation expanded to take const argument #6127
  • Improve Rust bindings: Map, Array, String, various IR nodes #6339
  • Compatible for ROCm before 3.7 #6359
  • Use clear name that is separate from ASF brand for cache #6360
  • Fix typo #6352
  • Fix mistyped word #6362
  • Fix Dockerfile.demo_android #6361
  • Fix typo #6338

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view of the significance of contributions.

tqchen (93), zhiics (34), junrushao1994 (30), FrozenGene (27), comaniac (23), masahi (19), jroesch (19), leandron (15), tmoreau89 (14), mbrookhart (14), ZihengJiang (13), MarisaKirisame (12), anijain2305 (11), merrymercy (10), icemelon9 (9), u99127 (8), siju-samuel (7), jwfromm (7), jcf94 (7), vinx13 (5), kevinthesun (5), yongwww (5), electriclilies (4), liangfu (3), kparzysz-quic (3), tkonolige (3), Hzfengsy (3), Laurawly (2), eqy (2), cchung100m (2), mbaret (2), weberlo (2), ANSHUMAN87 (2), cbalint13 (2), roastduck (2), spectrometerHBH (2), tom-gall (2), cloud-mxd (2), yzhliu (1), srkreddy1238 (1), kazum (1), wweic (1), nhynes (1), apivovarov (1), lixiaoquan (1), t-vi (1), ajtulloch (1), areusch (1), sxjscience (1), xqdan (1), trevor-m (1), hypercubestart (1), szha (1), manupa-arm (1), binarybana (1), leonwanghui (1), Leo-arm (1), samskalicky (1), Shawn-Inspur (1), hanzz2007 (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities)

tqchen (17), leandron (11), ZihengJiang (8), electriclilies (7), tkonolige (6), siju-samuel (5), merrymercy (5), masahi (5), junrushao1994 (5), mbrookhart (5), lhutton1 (5), d-smirnov (5), csullivan (5), jainris (5), zhiics (4), MarisaKirisame (4), areusch (4), ANSHUMAN87 (4), windclarion (4), zhanghaohit (4), cloud-mxd (4), tmoreau89 (3), jroesch (3), lixiaoquan (3), mbaret (3), jcf94 (3), hypercubestart (3), yzhliu (2), anijain2305 (2), comaniac (2), FrozenGene (2), kparzysz-quic (2), jwfromm (2), cbalint13 (2), trevor-m (2), giuseros (2), tom-gall (2), fernchen (2), xutianming (2), wjliu1998 (2), wrongtest (2), shiwenloong (2), domin1985 (2), vinx13 (1), kevinthesun (1), slyubomirsky (1), yongwww (1), cchung100m (1), abergeron (1), weberlo (1), huajsj (1), xqdan (1), maheshambule (1), eric-haibin-lin (1), spectrometerHBH (1), Fwd-IV (1), huochaitiantang (1), mwillsey (1), lsy643 (1), ghostplant (1), tkat0 (1), GaryYuyjl (1), hzfan (1), iswariyam (1), lanchongyizu (1), mplemay (1), minminsun (1), samskalicky (1), DemonGiggle (1), mvermeulen (1), hanzz2007 (1), quic-sanirudh (1), sandyhu533 (1)
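
For readers who want to process these tallies, the "name (number of activities)" format above can be parsed with a few lines of Python. This is only a minimal sketch: the function name `parse_activity_tally` is hypothetical and not part of TVM or any community tooling, and the pattern assumes names contain only letters, digits, underscores, dots, and hyphens.

```python
import re


def parse_activity_tally(line):
    """Parse 'name (count), name (count), ...' into (name, count) pairs.

    Hypothetical helper for illustration only; assumes each entry is a
    handle followed by a space and an integer count in parentheses.
    """
    pairs = re.findall(r"([\w.-]+) \((\d+)\)", line)
    return [(name, int(count)) for name, count in pairs]


# A short excerpt taken from the list above.
sample = "tqchen (17), leandron (11), ZihengJiang (8)"
tally = parse_activity_tally(sample)

# Sum the activity counts in the excerpt.
total = sum(count for _, count in tally)
```

Keeping the result as a list of pairs preserves the original ordering of the post, which is sorted by activity count.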
