TVM Monthly - Sep 2020

ziheng · October 23, 2020, 8:30pm

As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so that users and developers can better understand what is happening in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

The community welcomes new committers @comaniac and @vegaluis, and new reviewers @lhutton1 and @hypercubestart.

This forum got 12.3k pageviews and 2.1k user visits in the last month.

We also delivered the v0.7 release: https://github.com/apache/incubator-tvm/releases/tag/v0.7.0 .

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Torch

Fix aten::max and aten::min conversion #6372
Support logsumexp, clean up unnecessary infer_shape usage #6374
reshape op with non constant shape argument #6411
Miscellaneous fix, enable some VM GPU tests #6418
Clean up usage of try … infer_value() … except #6504
Disable a flaky test #6585
Object detection support update for PyTorch 1.6 #6659
Necessary workaround to prepare for 1.6 update #6602

BYOC

Support input nodes with multiple entries #6368
Introduce further operator support #6355
Add maximum support for float32 #6506
Fix tests for new module API #6560
Support add operation #6532
Support control flow in annotate_target #6641
TensorRT BYOC integration #6395

Relay

Enhance relay.split(), allow splitted dim to be dynamic #6289
Fix the FoldConstant Regression for VTA #6377
Fix Type Arguments not Attached #6385
Dynamic UpSampling3D Op #6353
add conv2d_transpose alter layout #6358
Allow StructuralEqual/Hash via Var.vid #6424
Fix stack op axis check, support torch::stack conversion for a static list #6433
Add Defunctionalization Pass #6400
Fixed bug in quantized conv2d. #6420
Fix Reshape Compute #6396
Advanced indexing #6388
Some backend improvements for PT OD models #6464
Mix mode context analysis #6403
Improve doc for nn.dense #6508
Allow dynamic batch for arm conv2d #6509
roi_align operator alter layout #6443
Show yolo detection result in text. #6367
Make missing desired layout non-fatal #6553
Support "dot" and "LogisticRegressionOutput" #6542
Fix Strided Slice Infer Layout #6621
Enable heterogeneous execution for Relay VM #6337
Refactor InferType to work on whole module, and use new diagnostics. #6274
Support broadcast_like #6561
Allow A to B broadcasting of batch_matmul and reverse strided slice #6681
support i64 indices #6143
Change some passes to mix mode #6695
Dynamic conv2d batch size for cuda #6598
Fix MXNet frontend to support NLP backbones in GluonNLP #6699
Loop Support #6700
Minor fix for some TF OD models #6729

Fixes

acquire gil while finalizing PackedFunc #6378
Refactor tests to run on either the GPU or CPU #6331
Make docs build again when not everything is enabled #6386
Fix the docker binary cache location #6390
Avoid adding annotation twice for ConstantNode #6364
Support for 'SAME' Padding option for TRANSPOSE_CONV operator of TFLite. #6381
Use CFBridgeRetain for retaining the allocated resource #6393
Support scalar inputs in where op #6383
Add safe up/downcasting to the Rust object system #6384
Add layout_transform, clip and expand_dims in onnx converter #6366
Fix the error when running tests with default targets #6394
ACL integration bugfix: "verify" call parameter name changed #6382
Implemented MATRIX_DIAG Operator for TFLite. #6397
Remove comparison of unsigned expression < 0 warning #6319
Dynamic Strided Slice #6316
Switch Windows CI to build Release instead of Debug #6427
Address issue #6415 using compiler-rt half-float function. #6431
set MTLBuffer purgeable state (#6376) #6438
Improve the error reporting in build.rs files by using anyhow. #6401
Target Tags, Composite Target and Unified Interface #6369
CUDA: broaden path detection #6444
Fix broadcast shape #6422
ROCm: use GcnArch for mcpu and ApiVersion to select code object version #6447
Convert all Python code w/o CI #6448
Fix MSVC warnings #6450
Fix constant folding folding (big) constant in primitive function. #6436
fix: BooleanToTranspose function definition conflict #6452
Add How to deploy graph runtime example under new module factory #6459
Update RPC module to enable remote linking. #6462
Improve FindLLVM to handle llvm-prefix with space. #6466
Add scripts for applying Black to the Python code. #6437
fix: remove anonymous namespace and rename BooleanToTranspose #6465
add aten::pixel_shuffle implementation (#6328) #6468
Fix black script for Python formatting #6469
use macro to replace hardcode number #6365
µTVM RPC server and Part 1 of AutoTVM compilation infrastructure #6334
Ignore deleted files when linting #6484
Fix typos in Ansor #6425
Fix python formatting issue #6491
GraphRuntime: Update the tutorials to the module-based interface #6482
Update to Vivado 2020.1 and Pynq 2.5 #6402
Eliminate python dependence from bitstream build TCL script #6495
black format master #6494
enhance build script for optional vta dep #6497
RPC server compilation fix in Windows #6498
Switch CRC-CCITT libraries #6499
Hybrid Script Improvement #6507
Add several op mapping in PyTorch frontend #6472
Add beagleboard ai, thunderx and stm32mp1 to the arm_cpu target. #6501
Update CI badge location #6517
bugfix in BinaryBroadcastLayout + unit test #6513
Use fmt off to disable problematic black fmt #6519
Enable more warnings when compiling with clang 10.0 or greater #6456
fix libtvm build dependencies when USE_MICRO is ON #6524
Change some tutorial text #6514
Add the SYSTEM keyword to all cmake include_directories commands #6531
Lazily import micro when starting an RPC server #6505
Fixes AttributeError during ConvertLayout to NHWC #6419
Fix qnn.conv2d layout conversion too many values to unpack #6442
Fix Some Failed Tutorials of The Issue #6453 #6534
Added 'offsets' and 'alignment' attributes to MATRIX_SET_DIAG. #6429
Add alternate cublaslt library name. CUDA 11.0 uses cublasLt. #6541
Allow convert Context to ArgValue #6544
update webgpu api #6547
Added dilation_value attribute to dilate operator. #6550
Generalize the use of booleans to support all cmake boolean values. #6515
Feat(frontend-pytorch): Add input types argument and Support cast to … #6546
Fix rewrite_simplify tir::builtin::shift_left #6555
Add proper cmake PATHS when multiple NAMES. #6558
Rename tvm.hybrid.script to tvm.script. #6522
Fix misprint in demo.cc during initializing of picture tensor data #6566
Remove settings about SGX in config.cmake #6530
fix CMAKE flag name + update documentation #6567
Fix android runtime error #6575
Support rocblas_sgemm_strided_batched #6579
NDArray CopyFrom/To Bytes always synchronize #6586
properly pass through command-line args in docker/bash.sh #6599
Bring Your Own Datatypes #5812
add black-format to docker/lint.sh, support in-place format #6601
Fix parsing op string attributes #6605
Add ci_qemu docker image #6485
Simplify reduce expression in te.gradient #6611
Missing documentation dependency 'autodocsumm' on docs/README.txt #6595
Bump version to 0.7.0 #6614
Updated runtime to run under FreeBSD. #6600
Update NEWS.md for v0.7 #6613
Fixes #6608: CHECK(data != nullptr) causes type checking to fail #6610
Update to 20.08 version of the ethosn-driver. #6606
Version for v0.8 cycle #6615
Dynamic ONNX Importer #6351
Improve NDArray, GraphRt, and Relay bindings #6563
Fix example code #6627
Add dot product support for quantized convolution. #6445
Fix a bug with Alter Op Layout #6626
Link demo_* targets with LDFLAGS and also with -lm. #6636
Add qemu build step to CI #6644
Add a test for asymmetric padding in ONNX conv and fix Importer to support it #6646
Missing header for GraphRuntimeFactory in android_rpc #6648
Fix leakyReLU support for CoreML #6651
Add Range op to ONNX, make tvm arange shape_func support negative steps #6647
Avoid use of builtin math functions #6630
Revert "Keep fixed dim when unifying dynamic shape (#5795)" #6658
Faster sparse_dense on GPUs #6580
Fix typographical error. #6664
filter out error features #5952
don't validate AttrInitEntry until a value has attempted to be set #6672
TF argmax - handling int64 datatype #6674
Call InferType explicitly in coreml test #6676
Adjust Vulkan queue selection and creation logic #6662
Revert #5238 #6680
Fix format error in integrate.rst #6677
Introduce iterator (quasi)affine map detection. #6667
util => utils for consistency in the project. #6684
int32 pooling with int64 shapes #6687
Add µTVM Zephyr support + QEMU regression test #6603
Update CI CPU and GPU images based on new Docker build files. #6690
Fix detection of crop in convert_batch_to_space_nd #6670
Fix tutorial broken by Docker build #6694
Recover windows support for the latest LLVM #6698
Resolve more warnings in msvc #6702
Add cloudpickle dependency to docker images #6701
Refactor diagnostic to avoid circular dependencies #6692
More robust dll loading behavior after python3.8 #6707
Fix the Type bug in ConvertSSA. #6709
Support multiple cache read and fix bugs #6686
Auto scheduler tutorial failure on CI #6723
Fix InferCorrectLayout for dynamic upsampling and add a regression test #6712
refactor #6734
Create fixed vector size according to latest LLVM12+ changes #6717
Add missing python dependency in the setup #6375
fix compilation error when setting USE_RELAY_DEBUG #6380
Hot fix for Windows CI #6434
Add tvm.testing to the docs #6458
Save tensor size with alignment #6487
Fix a typo in hybrid script tutorial. #6525
fix the python script for building resnet (#6526) #6527
Print warning when all autotvm tasks fail with errors #6612
More descriptive error message when an autotvm task is not found #6652
Skip microtvm tests if microtvm is not built #6693
Fix cublas batch matmul #6715
Use set_property with append flag instead of set_target_properties #6725

Frontend

Improve TensorFlow control flow nodes ordering #6387
Improve PyTorch frontend for object detection models #6449
Add PyTorch OD tutorial #6500
Added broadcasting to prelu alpha. #6549
Fix TF 1.15 conv2d_transpose parsing #6589

CI

Add Vitis-AI docker installation #6342
Add black to lint docker image #6451
Cancel previous build if a new commit is pushed to a PR #6518
remove old pylint 1.9.4 from docker installation script #6538
Update ci-cpu to the latest #6632
add python environment setup as part of cpp unittest runner script #6639
make sure graphviz is on both ci-cpu and ci-gpu images #6645
Move to use main as the default #6665
Set main as default in github actions #6669
Install xgboost>=1.1.0 in CI container #6679
CI docker staging update to latest #6708

ONNX

Add Clip importer to handle when min/max are provided as inputs. #6251
Add support for GatherElements conversion #6446
Update Slice op conversion to take strides into account, clean up tests #6467

Ansor

Enable random fill and CPU cache flush for AutoTVM and Ansor #6391
Phase 2: Layout Rewrite in AutoScheduler #6297
Using the template-free auto-scheduler on CPU #6488
Auto-scheduler tutorial for GPU and necessary refactor/fix #6512
Parallel the InitPopulation #6529
Turn on USE_RANDOM by default #6562
Bug fix for compute at mutation error #6557
Support multiple output ops and fix Python API printing #6584

Test

Fix Some Failed Test Cases and Tutorials of The Issue #6453 #6454
Remove unintentional pytest dependency #6399
Temporarily disable test_mutate_parallel #6572
improve TEDD tests to also run on CPU Docker image #6643
Address flaky error in test_any #6705

TOPI

For 1D loop, make an outer loop parallel after axis split #6455
Fix missing import in bifrost schedule #6479
Fixed a typo in topi key #6502
Fix declaration_conv2d_transpose_impl #6428
Group conv2d NHWC op implementation #6510
Tiny bug fix for non-fp32 datatypes in conv2d_transpose. #6593
Allow batch_matmul to broadcast along batch dimension. #6616

Community

comaniac -> Committer #6463
hypercubestart -> Reviewer #6511
Add Ziheng's key for ASF release #6552
Zhi's key for ASF release #6554
vegaluisjose -> Committer #6582
lhutton1 -> Reviewer #6461
areusch -> Reviewer #6637
junrushao1994 -> committer #6719

Doc

Fix Some Broken Web Links #6475
Fix missing te in the code example #6569
Update release document #6573
add KEYS to downloads.apache.org #6581

TVMC

linting error on onnx command line driver frontend #6536
Introduce 'tune' subcommand (part 3/4) #6537
unify all logs on a single logger 'TVMC' #6577
fix command line argument variable name in 'compile' #6574
command line driver 'compile' (part 2/4) #6302
Getting started tutorial for TVMC #6597
fail gracefully in case no subcommand is provided #6625
Introduce 'run' subcommand (part 4/4) #6578

AutoScheduler

Improve hyperlinks in the tutorial #6521
Improve the rule of mutating parallel granularity #6568
Improve the GPU tutorial by deleting measure_ctx earlier #6660
Improve test cases #6657
Fix a bug in thread binding #6683
Add task scheduler #6663
Use tempfile in tutorials #6728
Guarantee init population sampling outputs a valid set #6713

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community's view of the significance of contributions.

tqchen (88), zhiics (48), comaniac (40), junrushao1994 (38), masahi (16), ZihengJiang (16), tmoreau89 (16), leandron (15), anijain2305 (12), jroesch (12), t-vi (10), mbrookhart (9), FrozenGene (9), siju-samuel (8), kevinthesun (8), jcf94 (8), yongwww (7), icemelon9 (6), merrymercy (6), mbaret (6), yzhliu (5), areusch (5), electriclilies (5), jwfromm (4), u99127 (4), MarisaKirisame (3), cbalint13 (3), binarybana (3), liangfu (2), lhutton1 (2), tkonolige (2), mwillsey (2), manupa-arm (2), Leo-arm (2), vinx13 (1), kparzysz-quic (1), vegaluisjose (1), ANSHUMAN87 (1), antinucleon (1), xqdan (1), rkimball (1), gussmith23 (1), Hzfengsy (1), michalpiszczek (1), robo-corg (1), jtuyls (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities)

tqchen (17), masahi (11), lhutton1 (11), merrymercy (9), rkimball (9), kevinthesun (8), tmoreau89 (6), jainris (6), jroesch (5), areusch (5), zhiics (4), yzhliu (4), leandron (4), tkonolige (4), comaniac (3), junrushao1994 (3), lixiaoquan (3), t-vi (3), trevor-m (3), jcf94 (3), roastduck (3), hypercubestart (3), Johnson9009 (3), Beya2019 (3), ZihengJiang (2), mbrookhart (2), mbaret (2), cbalint13 (2), d-smirnov (2), spectrometerHBH (2), csullivan (2), tom-gall (2), cloud-mxd (2), intheworld (2), icemelon9 (1), MarisaKirisame (1), anijain2305 (1), FrozenGene (1), jwfromm (1), yongwww (1), huajsj (1), giuseros (1), windclarion (1), electriclilies (1), zxy844288792 (1), mwillsey (1), xutianming (1), wjliu1998 (1), wrongtest (1), qixiuai (1), 12101111 (1), minminsun (1), jacobpostman (1), nolanliou (1), WenheLI (1), XIAO-XIA (1), yzh119 (1), anilmartha (1), dlexplorer (1), euntaik (1), DemonGiggle (1), insop (1)