TVM Monthly - February 2021

As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings on of the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

During February 2021, we welcomed many new contributors to the project. Importantly, we welcomed @d-smirnov as a reviewer. Thanks to everyone for their hard work and contributions!

This forum got 13.5k pageviews and 2.4k user visits in the last month.

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Fixes

  • Another attempt to fix flaky segfaults from torch detection test #7371
  • fix duplicated symbol bug in external codegen #7383
  • Fix tokenizing inf #7370
  • @hzfan -> reviewer #7360
  • Refactor Dynamic to Static #7368
  • Fix missing round(), floor(), ceil() for target C lowering #7382
  • Improve error messages when array/map types do not match in function calls #7330
  • Added check for dynamic range quantization #7114
  • Generate requirements.txt from Python spec #7289
  • Replace timestamp with counter #7389
  • Support negative pad values #7375
  • Add cuda tags and unit test #7410
  • Improve op_type missing message #7384
  • Check for dynamic rank before accessing value in Dynamic Reshape #7414
  • Minor refactor for C++ memory alloc #7413
  • Fix AutoScheduler for anaconda python #7387
  • Fix compilation when __ARM_FEATURE_FP16_SCALAR_ARITHMETIC #7386
  • Jenkinsfile changes for #7333. #7388
  • Add VMWare to Reference VM instructions #7221
  • Generate JUnitXML from pytest #7407
  • Only compile runtime files once #7417
  • Only set Clang flags for C++ files #7424
  • TRT Dynamic Reshape Fix #7412
  • Simplify full broadcast #7423
  • Fix iter_affine_map with non-const extent #7437
  • Stop running some python testsuites twice #7430
  • Replace type punning with memcpy. #7415
  • Fix Bug in Bilinear Interpolation and Add Deform Conv to PT FrontEnd #7397
  • Make the TVM targets list available in Python #7427
  • Fix double compile of runtime sources for TRT, ACL #7436
  • Fix SelectNode TIRTextPrinter bracket mismatch #7405
  • Add minimal ROCm docker #7422
  • Use standalone_crt build tree for all µTVM builds #7333
  • Move param bind to OptimizeModule #7451
  • update stm32mp1 arm_cpu target configuration #7443
  • Print .elf statistics for a model runtime built with Zephyr #7449
  • Add IdentityN operator for TF Frontend #7452
  • docker/bash.sh: lookup docker image in Jenkinsfile #7453
  • Add Thrust support #7458
  • Report JUnit test results for all TVM Python tests #7450
  • SparseFillEmptyRows Op #7442
  • Fix Bug Which Cause Negative Left Shift Op #7433
  • Add support for default Ethos-N78 configuration. #6982
  • Set TOpPattern=kOpaque for scatter_nd #7464
  • Allow manual shape specification in tvmc #7366
  • Make spelling of "axes" consistent #7460
  • Get tvmc version from tvm #7478
  • make test_runtime_rpc use pytest.main() #7482
  • Specialize MutateArray in StmtFunctor. #7486
  • Add composite target passes for compilation and tuning #7304
  • Enforce -libs=thrust to allow thrust offload #7468
  • Add target host field for target specification #7462
  • Fixed minor misspelling #7499
  • Fix stack overflow when partially-init Node raises exception. #7481
  • @d-smirnov -> reviewer #7510
  • rename composite target "acl" #7508
  • Support creating Bool constants in the pattern_utils #7507
  • Update tags with minor fix #7448
  • Remove incubating from docs #7525
  • Enable proper error message in python package #7521
  • Introduce module_loader to AutoTVM. #7337
  • Many fixes to get unit tests passing on Windows. #7431
  • SparseReshape Op #7477
  • Add create_local_debug_runtime to micro exports #7528
  • Don't run non-tvm_op GraphRuntime nodes in Debug Runtime over RPC. #7512
  • Add test_forward_index_put to main #7542
  • add missing equal sign #7531
  • Fix typo in relay.vm.Executable #7543
  • fuse constant padding into conv kernels #7515
  • Fix: cuda codegen vectorize cast #7561
  • Create C-runtime-style metadata module for llvm builds #7398
  • Profiling TVM compiler passes #7500
  • Add TIR While node #7425
  • introduce Block and BlockRealize #7553
  • Support conds depend on outer loop vars inside tensorize scope #7497
  • Add SPIR-V lowering for While node #7574
  • compile engine dump tir and shape funcs #7552
  • Fix a flaky test #7580
  • Fix: install script regarding get-pip.py during docker build #7579
  • Add support for 20.11 Ethos-N driver stack release #7506
  • Fixes for using Python APIs from Rust. #7085
  • Add segment sum Op to relay and 7 corresponding TF Ops, fix scatter_add dynamic bug #7562
  • Support Bool buffer argument #7591
  • Fix for dynamic batch size conv2d nhwc #7598
  • Guarantee data input is the first argument #7592
  • Support negative axis for gather #7600
  • Support passing 64 bit scalar #7572
  • Fix autotuning, broken in #7337 #7566
  • Sparse dense tuning support with custom sketch rule #7313
  • BF16 support #7014
  • Fix bug in AutoInlineElemWise and implement AutoInlineBroadcast #7602
  • Add logging to diagnose flaky ci-qemu test #7610
  • Move SimplifyConvPad to a new pass and don't enable it by default #7603
  • Fix clang12 warnings #7593

Relay

  • Iterative A-normal Traversals #7374
  • Refactor where importer to support dynamic shapes. #7394
  • Dense with weight transform #7404
  • Fix missing return in scatter_nd cuda strategy #7447
  • Add max mode to ROI align #7440
  • Crash in match_exhaustion.cc when given an empty tuple pattern or constructor with no args #7459
  • Support roi_align NHWC layout #7463
  • Fix off-by-one error in BiasAddRel, use new reporting #7467
  • Optimize relay parser to restore calls attrs #7347
  • Fix GEMM converter when C is not a parameter. #7509
  • Enforce static dim for non-concat axis if one or more tensors have static dim #7487
  • Fix foldconstant involving dropout #7550
  • Modify some passes to not stack overflow on many lets. #7558
  • BiasAddRel does not check for a negative index being out of bounds #7554
  • Fix Bug Which Cause Negative Left Shift Op #7432
  • Avoid stack overflow when using PostOrderRewrite #7588
  • add ShapeFunc for tanh #6898
  • Fix relay op strategy for cuda dense int8 #7586
  • add ShapeFunc for one_hot op #7490

TOPI

  • Add einsum operator #6370
  • Fix cuda nms handling of additional per box features #7483
  • Allow topi resize to accept more options #7532
  • disable test_shift with i8 datatype #7597

Autoscheduler

  • Fail to register ComputeDAG when deserializing tasks #7395
  • Support early_stopping per task #7377
  • Add sampling to dispatcher #7376
  • Fix distill record #7439
  • Fix the type inference for conv3d #7475
  • Fix the type inference for conv2d #7501
  • Autoscheduler layout rewrite pass to VM #7516
  • Querying and sampling in task extraction #7571
  • Correctly resume status #7614

CI

  • Temporary increase ci timeout #7403
  • Add back the tests after timeout adjusted #7408
  • Move ci-cpu to use llvm-11 #7541
  • Update CI Vitis AI PyXIR version #7575
  • Bump ARM image version #7584

ONNX

  • Add CumSum operator to ONNX frontend #7391
  • Make the ONNX Importer More Static #7429
  • use checked_type instead of type_annotation #7522
  • fix datatype on Reciprocal op #7519

BYOC

  • Fix small bug preventing TRT runtime compilation for versions < 6 #7372
  • Refactor Verilator runtime #7406
  • Fix issue in Vitis AI codegen out tensor names matching & update docs and docker #7350
  • Make TRT runtime robust to empty or weird subgraphs #7581
  • Fix groups cannot divide output channel count error for deconv when groups>1 #7595

Frontend

  • Make keras reshape less restrictive #7446
  • Add support for MXNet GroupNorm #7409
  • get input tensor information from graph #7400
  • Support explicit_paddings for TF 2.x #7445
  • Make onnx gemm tensor C optional #7489
  • Support range like axis in tf.raw_ops.All for TF 2.x #7502
  • Support CombinedNonMaxSuppression #7520
  • TF V2 sparse.todense() test added #7473
  • Add unique operator #7441
  • Fix default value for is_ascend in topk #7568

Bugfix

  • debug operator--() in include/tvm/node/container.h #7461
  • Properly return and unflatten outputs from GraphExecutor #7604

Runtime

  • Fast path for single thread run to allow app level threading #7454
  • Special Memory Scope Support #7488
  • Move Map into runtime #7570
  • Add device specific timers #7472
  • Unify load params interface #7559
  • Add Object::unique() #7615

Torch

  • Add index_put operator #7465
  • Pool ops, convert strides and pool_size to int #7517
  • Avoid adding unnecessary slicing #7479
  • Add narrow operator #7535
  • Simplify contiguous #7544
  • Fix converting torch slice op with dynamic slice length #7549
  • Add linear operator support #7569
  • Support quantized mobilenet v3 from torch 1.8 #7606

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view of the significance of contributions.

tqchen (83), junrushao1994 (40), comaniac (34), zhiics (31), mbrookhart (23), masahi (22), MarisaKirisame (20), tkonolige (19), tmoreau89 (18), leandron (14), icemelon9 (10), wweic (9), FrozenGene (9), anijain2305 (8), jroesch (8), jwfromm (8), weberlo (8), merrymercy (7), areusch (7), mbaret (7), ZihengJiang (6), slyubomirsky (6), trevor-m (6), joshpoll (6), manupa-arm (6), binarybana (6), yzhliu (5), altanh (5), adelbertc (5), u99127 (4), robo-corg (4), vinx13 (3), eqy (3), giuseros (3), codeislife99 (3), rkimball (3), electriclilies (3), siju-samuel (2), kevinthesun (2), Laurawly (2), apivovarov (2), liangfu (2), ANSHUMAN87 (2), zhreshold (2), csullivan (2), ehsanmok (2), mdw-octoml (2), mwillsey (2), leonwanghui (2), srkreddy1238 (1), kazum (1), nhynes (1), vegaluisjose (1), hlu1 (1), t-vi (1), ajtulloch (1), lhutton1 (1), jcf94 (1), sxjscience (1), alex-weaver (1), xqdan (1), gussmith23 (1), grwlf (1), ymwangg (1), gromero (1), zxybazh (1), monklof (1), Leo-arm (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities)

jroesch (66), areusch (14), masahi (12), comaniac (10), codeislife99 (10), mbrookhart (9), tqchen (7), trevor-m (7), tkonolige (5), monklof (5), apivovarov (4), electriclilies (4), zxybazh (3), jwfromm (2), vegaluisjose (2), slyubomirsky (2), leandron (2), ANSHUMAN87 (2), cbalint13 (2), rkimball (2), altanh (2), hypercubestart (2), ymwangg (2), jtuyls (2), NicolaLancellotti (2), mshr-h (2), merrymercy (1), ZihengJiang (1), MarisaKirisame (1), yzhliu (1), anijain2305 (1), vinx13 (1), tmoreau89 (1), kevinthesun (1), FrozenGene (1), yongwww (1), d-smirnov (1), u99127 (1), windclarion (1), gussmith23 (1), csullivan (1), zxy844288792 (1), tristan-arm (1), xutianming (1), gromero (1), yinghai (1), euntaik (1), Beya2019 (1), Johnson9009 (1), alexwong (1), domin1985 (1), echuraev (1), lsy643 (1), dlexplorer (1), CircleSpin (1), grant-arm (1), hanke580 (1), MatthewARM (1), vinceab (1), Wheest (1)