TVM Monthly - February 2021

ziheng · March 9, 2021, 3:13pm

As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so that users and developers can get a better understanding of the goings-on in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

During February 2021 we welcomed many new contributors to the project. Importantly, we welcomed @d-smirnov as a reviewer. Thanks to everyone for their hard work and contributions!

This forum got 13.5k pageviews and 2.4k user visits in the last month.

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Fixes

Another attempt to fix flaky segfaults from torch detection test #7371
fix duplicated symbol bug in external codegen #7383
Fix tokenizing inf #7370
@hzfan -> reviewer #7360
Refactor Dynamic to Static #7368
Fix missing round(), floor(), ceil() for target C lowering #7382
Improve error messages when array/map types do not match in function calls #7330
Added check for dynamic range quantization #7114
Generate requirements.txt from Python spec #7289
Replace timestamp with counter #7389
Support negative pad values #7375
Add cuda tags and unit test #7410
Improve op_type missing message #7384
Check for dynamic rank before accessing value in Dynamic Reshape #7414
Minor refactor for C++ memory alloc #7413
Fix AutoScheduler for anaconda python #7387
Fix compilation when __ARM_FEATURE_FP16_SCALAR_ARITHMETIC #7386
Jenkinsfile changes for #7333. #7388
Add VMWare to Reference VM instructions #7221
Generate JUnitXML from pytest #7407
Only compile runtime files once #7417
Only set Clang flags for C++ files #7424
TRT Dynamic Reshape Fix #7412
Simplify full broadcast #7423
Fix iter_affine_map with non-const extent #7437
Stop running some python testsuites twice #7430
Replace type punning with memcpy. #7415
Fix Bug in Bilinear Interpolation and Add Deform Conv to PT FrontEnd #7397
Make the TVM targets list available in Python #7427
Fix double compile of runtime sources for TRT, ACL #7436
Fix SelectNode TIRTextPrinter bracket mismatch #7405
Add minimal ROCm docker #7422
Use standalone_crt build tree for all µTVM builds #7333
Move param bind to OptimizeModule #7451
update stm32mp1 arm_cpu target configuration #7443
Print .elf statistics for a model runtime built with Zephyr #7449
Add IdentityN operator for TF Frontend #7452
docker/bash.sh: lookup docker image in Jenkinsfile #7453
Add Thrust support #7458
Report JUnit test results for all TVM Python tests #7450
SparseFillEmptyRows Op #7442
Fix Bug Which Cause Negative Left Shift Op #7433
Add support for default Ethos-N78 configuration. #6982
Set TOpPattern=kOpaque for scatter_nd #7464
Allow manual shape specification in tvmc #7366
Make spelling of "axes" consistent #7460
Get tvmc version from tvm #7478
make test_runtime_rpc use pytest.main() #7482
Specialize MutateArray in StmtFunctor. #7486
Add composite target passes for compilation and tuning #7304
Enforce -libs=thrust to allow thrust offload #7468
Add target host field for target specification #7462
Fixed minor misspelling #7499
Fix stack overflow when partially-init Node raises exception. #7481
@d-smirnov -> reviewer #7510
rename composite target "acl" #7508
Support creating Bool constants in the pattern_utils #7507
Update tags with minor fix #7448
Remove incubating from docs #7525
Enable proper error message in python package #7521
Introduce module_loader to AutoTVM. #7337
Many fixes to get unit tests passing on Windows. #7431
SparseReshape Op #7477
Add create_local_debug_runtime to micro exports #7528
Don't run non-tvm_op GraphRuntime nodes in Debug Runtime over RPC. #7512
Add test_forward_index_put to main #7542
add missing equal sign #7531
Fix typo in relay.vm.Executable #7543
fuse constant padding into conv kernels #7515
Fix: cuda codegen vectorize cast #7561
Create C-runtime-style metadata module for llvm builds #7398
Profiling TVM compiler passes #7500
Add TIR While node #7425
introduce Block and BlockRealize #7553
Support conds depend on outer loop vars inside tensorize scope #7497
Add SPIR-V lowering for While node #7574
compile engine dump tir and shape funcs #7552
Fix a flaky test #7580
Fix: install script regarding get-pip.py during docker build #7579
Add support for 20.11 Ethos-N driver stack release #7506
Fixes for using Python APIs from Rust. #7085
Add segment sum Op to relay and 7 corresponding TF Ops, fix scatter_add dynamic bug #7562
Support Bool buffer argument #7591
Fix for dynamic batch size conv2d nhwc #7598
Guarantee data input is the first argument #7592
Support negative axis for gather #7600
Support passing 64 bit scalar #7572
Fix autotuning, broken in #7337 #7566
Sparse dense tuning support with custom sketch rule #7313
BF16 support #7014
Fix bug in AutoInlineElemWise and implement AutoInlineBroadcast #7602
Add logging to diagnose flaky ci-qemu test #7610
Move SimplifyConvPad to a new pass and don't enable it by default #7603
Fix clang12 warnings #7593

Relay

Iterative A-normal Traversals #7374
Refactor where importer to support dynamic shapes. #7394
Dense with weight transform #7404
Fix missing return in scatter_nd cuda strategy #7447
Add max mode to ROI align #7440
Crash in match_exhaustion.cc when given an empty tuple pattern or constructor with no args #7459
Support roi_align NHWC layout #7463
Fix off-by-one error in BiasAddRel, use new reporting #7467
Optimize relay parser to restore calls attrs #7347
Fix GEMM converter when C is not a parameter. #7509
Enforce static dim for non-concat axis if one or more tensors have static dim #7487
Fix foldconstant involving dropout #7550
Modify some passes to not stack overflow on many lets. #7558
BiasAddRel does not check for a negative index being out of bounds #7554
Fix Bug Which Cause Negative Left Shift Op #7432
Avoid stack overflow when using PostOrderRewrite #7588
add ShapeFunc for tanh #6898
Fix relay op strategy for cuda dense int8 #7586
add ShapeFunc for one_hot op #7490

TOPI

Add einsum operator #6370
Fix cuda nms handling of additional per box features #7483
Allow topi resize to accept more options #7532
disable test_shift with i8 datatype #7597

Autoscheduler

Fail to register ComputeDAG when deserializing tasks #7395
Support early_stopping per task #7377
Add sampling to dispatcher #7376
Fix distill record #7439
Fix the type inference for conv3d #7475
Fix the type inference for conv2d #7501
Autoscheduler layout rewrite pass to VM #7516
Querying and sampling in task extraction #7571
Correctly resume status #7614

CI

Temporary increase ci timeout #7403
Add back the tests after timeout adjusted #7408
Move ci-cpu to use llvm-11 #7541
Update CI Vitis AI PyXIR version #7575
Bump ARM image version #7584

ONNX

Add CumSum operator to ONNX frontend #7391
Make the ONNX Importer More Static #7429
use checked_type instead of type_annotation #7522
fix datatype on Reciprocal op #7519

BYOC

Fix small bug preventing TRT runtime compilation for versions < 6 #7372
Refactor Verilator runtime #7406
Fix issue in Vitis AI codegen out tensor names matching & update docs and docker #7350
Make TRT runtime robust to empty or weird subgraphs #7581
Fix groups cannot divide output channel count error for deconv when groups>1 #7595

Frontend

Make keras reshape less restrictive #7446
Add support for MXNet GroupNorm #7409
get input tensor information from graph #7400
Support explicit_paddings for TF 2.x #7445
Make onnx gemm tensor C optional #7489
Support range like axis in tf.raw_ops.All for TF 2.x #7502
Support CombinedNonMaxSuppression #7520
TF V2 sparse.todense() test added #7473
Add unique operator #7441
Fix default value for is_ascend in topk #7568

Bugfix

debug operator--() in include/tvm/node/container.h #7461
Properly return and unflatten outputs from GraphExecutor #7604

Runtime

Fast path for single thread run to allow app level threading #7454
Special Memory Scope Support #7488
Move Map into runtime #7570
Add device specific timers #7472
Unify load params interface #7559
Add Object::unique() #7615

Torch

Add index_put operator #7465
Pool ops, convert strides and pool_size to int #7517
Avoid adding unnecessary slicing #7479
Add narrow operator #7535
Simplify contiguous #7544
Fix converting torch slice op with dynamic slice length #7549
Add linear operator support #7569
Support quantized mobilenet v3 from torch 1.8 #7606

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view about the significance of contributions.

tqchen (83), junrushao1994 (40), comaniac (34), zhiics (31), mbrookhart (23), masahi (22), MarisaKirisame (20), tkonolige (19), tmoreau89 (18), leandron (14), icemelon9 (10), wweic (9), FrozenGene (9), anijain2305 (8), jroesch (8), jwfromm (8), weberlo (8), merrymercy (7), areusch (7), mbaret (7), ZihengJiang (6), slyubomirsky (6), trevor-m (6), joshpoll (6), manupa-arm (6), binarybana (6), yzhliu (5), altanh (5), adelbertc (5), u99127 (4), robo-corg (4), vinx13 (3), eqy (3), giuseros (3), codeislife99 (3), rkimball (3), electriclilies (3), siju-samuel (2), kevinthesun (2), Laurawly (2), apivovarov (2), liangfu (2), ANSHUMAN87 (2), zhreshold (2), csullivan (2), ehsanmok (2), mdw-octoml (2), mwillsey (2), leonwanghui (2), srkreddy1238 (1), kazum (1), nhynes (1), vegaluisjose (1), hlu1 (1), t-vi (1), ajtulloch (1), lhutton1 (1), jcf94 (1), sxjscience (1), alex-weaver (1), xqdan (1), gussmith23 (1), grwlf (1), ymwangg (1), gromero (1), zxybazh (1), monklof (1), Leo-arm (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities).

jroesch (66), areusch (14), masahi (12), comaniac (10), codeislife99 (10), mbrookhart (9), tqchen (7), trevor-m (7), tkonolige (5), monklof (5), apivovarov (4), electriclilies (4), zxybazh (3), jwfromm (2), vegaluisjose (2), slyubomirsky (2), leandron (2), ANSHUMAN87 (2), cbalint13 (2), rkimball (2), altanh (2), hypercubestart (2), ymwangg (2), jtuyls (2), NicolaLancellotti (2), mshr-h (2), merrymercy (1), ZihengJiang (1), MarisaKirisame (1), yzhliu (1), anijain2305 (1), vinx13 (1), tmoreau89 (1), kevinthesun (1), FrozenGene (1), yongwww (1), d-smirnov (1), u99127 (1), windclarion (1), gussmith23 (1), csullivan (1), zxy844288792 (1), tristan-arm (1), xutianming (1), gromero (1), yinghai (1), euntaik (1), Beya2019 (1), Johnson9009 (1), alexwong (1), domin1985 (1), echuraev (1), lsy643 (1), dlexplorer (1), CircleSpin (1), grant-arm (1), hanke580 (1), MatthewARM (1), vinceab (1), Wheest (1)