TVM Monthly - Sep 2020

As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings on of the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

The community welcomes new committers @comaniac and @vegaluis, and new reviewers @lhutton1 and @hypercubestart.

This forum got 12.3k pageviews and 2.1k user visits in the last month.

We also delivered the v0.7 release: https://github.com/apache/incubator-tvm/releases/tag/v0.7.0

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Torch

  • Fix aten::max and aten::min conversion #6372
  • Support logsumexp, clean up unnecessary infer_shape usage #6374
  • reshape op with non constant shape argument #6411
  • Miscellaneous fix, enable some VM GPU tests #6418
  • Clean up usage of try … infer_value() … except #6504
  • Disable a flaky test #6585
  • Object detection support update for PyTorch 1.6 #6659
  • Necessary workaround to prepare for 1.6 update #6602

BYOC

  • Support input nodes with multiple entries #6368
  • Introduce further operator support #6355
  • Add maximum support for float32 #6506
  • Fix tests for new module API #6560
  • Support add operation #6532
  • Support control flow in annotate_target #6641
  • TensorRT BYOC integration #6395

Relay

  • Enhance relay.split(), allow splitted dim to be dynamic #6289
  • Fix the FoldConstant Regression for VTA #6377
  • Fix Type Arguments not Attached #6385
  • Dynamic UpSampling3D Op #6353
  • add conv2d_transpose alter layout #6358
  • Allow StructuralEqual/Hash via Var.vid #6424
  • Fix stack op axis check, support torch::stack conversion for a static list #6433
  • Add Defunctionalization Pass #6400
  • Fixed bug in quantized conv2d. #6420
  • Fix Reshape Compute #6396
  • Advanced indexing #6388
  • Some backend improvements for PT OD models #6464
  • Mix mode context analysis #6403
  • Improve doc for nn.dense #6508
  • Allow dynamic batch for arm conv2d #6509
  • roi_align operator alter layout #6443
  • Show yolo detection result in text. #6367
  • Make missing desired layout non-fatal #6553
  • Support "dot" and "LogisticRegressionOutput" #6542
  • Fix Strided Slice Infer Layout #6621
  • Enable heterogeneous execution for Relay VM #6337
  • Refactor InferType to work on whole module, and use new diagnostics. #6274
  • Support broadcast_like #6561
  • Allow A to B broadcasting of batch_matmul and reverse strided slice #6681
  • support i64 indices #6143
  • Change some passes to mix mode #6695
  • Dynamic conv2d batch size for cuda #6598
  • Fix MXNet frontend to support NLP backbones in GluonNLP #6699
  • Loop Support #6700
  • Minor fix for some TF OD models #6729

Fixes

  • acquire gil while finalizing PackedFunc #6378
  • Refactor tests to run on either the GPU or CPU #6331
  • Make docs build again when not everything is enabled #6386
  • Fix the docker binary cache location #6390
  • Avoid adding annotation twice for ConstantNode #6364
  • Support for 'SAME' Padding option for TRANSPOSE_CONV operator of TFLite. #6381
  • Use CFBridgeRetain for retaining the allocated resource #6393
  • Support scalar inputs in where op #6383
  • Add safe up/downcasting to the Rust object system #6384
  • Add layout_transform, clip and expand_dims in onnx converter #6366
  • Fix the error when running tests with default targets #6394
  • ACL integration bugfix: "verify" call parameter name changed #6382
  • Implemented MATRIX_DIAG Operator for TFLite. #6397
  • Remove comparison of unsigned expression < 0 warning #6319
  • Dynamic Strided Slice #6316
  • Switch Windows CI to build Release instead of Debug #6427
  • Address issue #6415 using compiler-rt half-float function. #6431
  • set MTLBuffer purgeable state (#6376) #6438
  • Improve the error reporting in build.rs files by using anyhow. #6401
  • Target Tags, Composite Target and Unified Interface #6369
  • CUDA: broaden path detection #6444
  • Fix broadcast shape #6422
  • ROCm: use GcnArch for mcpu and ApiVersion to select code object version #6447
  • Convert all Python code w/o CI #6448
  • Fix MSVC warnings #6450
  • Fix constant folding folding (big) constant in primitive function. #6436
  • fix: BooleanToTranspose function definition conflict #6452
  • Add How to deploy graph runtime example under new module factory #6459
  • Update RPC module to enable remote linking. #6462
  • Improve FindLLVM to handle llvm-prefix with space. #6466
  • Add scripts for applying Black to the Python code. #6437
  • fix: remove anonymous namespace and rename BooleanToTranspose #6465
  • add aten::pixel_shuffle implementation (#6328) #6468
  • Fix black script for Python formatting #6469
  • use macro to replace hardcode number #6365
  • µTVM RPC server and Part 1 of AutoTVM compilation infrastructure #6334
  • Ignore deleted files when linting #6484
  • Fix typos in Ansor #6425
  • Fix python formatting issue #6491
  • GraphRuntime: Update the tutorials to the module-based interface #6482
  • Update to Vivado 2020.1 and Pynq 2.5 #6402
  • Eliminate python dependence from bitstream build TCL script #6495
  • black format master #6494
  • enhance build script for optional vta dep #6497
  • RPC server compilation fix in Windows #6498
  • Switch CRC-CCITT libraries #6499
  • Hybrid Script Improvement #6507
  • Add several op mapping in PyTorch frontend #6472
  • Add beagleboard ai, thunderx and stm32mp1 to the arm_cpu target. #6501
  • Update CI badge location #6517
  • bugfix in BinaryBroadcastLayout + unit test #6513
  • Use fmt off to disable problematic black fmt #6519
  • Enable more warnings when compiling with clang 10.0 or greater #6456
  • fix libtvm build dependencies when USE_MICRO is ON #6524
  • Change some tutorial text #6514
  • Add the SYSTEM keyword to all cmake include_directories commands #6531
  • Lazily import micro when starting an RPC server #6505
  • Fixes AttributeError during ConvertLayout to NHWC #6419
  • Fix qnn.conv2d layout conversion too many values to unpack #6442
  • Fix Some Failed Tutorials of The Issue #6453 #6534
  • Added 'offsets' and 'alignment' attributes to MATRIX_SET_DIAG. #6429
  • Add alternate cublaslt library name. CUDA 11.0 uses cublasLt. #6541
  • Allow convert Context to ArgValue #6544
  • update webgpu api #6547
  • Added dilation_value attribute to dilate operator. #6550
  • Generalize the use of booleans to support all cmake boolean values. #6515
  • Feat(frontend-pytorch): Add input types argument and Support cast to … #6546
  • Fix rewrite_simplify tir::builtin::shift_left #6555
  • Add proper cmake PATHS when multiple NAMES. #6558
  • Rename tvm.hybrid.script to tvm.script. #6522
  • Fix misprint in demo.cc during initializing of picture tensor data #6566
  • Remove settings about SGX in config.cmake #6530
  • fix CMAKE flag name + update documentation #6567
  • Fix android runtime error #6575
  • Support rocblas_sgemm_strided_batched #6579
  • NDArray CopyFrom/To Bytes always synchronize #6586
  • properly pass through command-line args in docker/bash.sh #6599
  • Bring Your Own Datatypes #5812
  • add black-format to docker/lint.sh, support in-place format #6601
  • Fix parsing op string attributes #6605
  • Add ci_qemu docker image #6485
  • Simplify reduce expression in te.gradient #6611
  • Missing documentation dependency 'autodocsumm' on docs/README.txt #6595
  • Bump version to 0.7.0 #6614
  • Updated runtime to run under FreeBSD. #6600
  • Update NEWS.md for v0.7 #6613
  • Fixes #6608: CHECK(data != nullptr) causes type checking to fail #6610
  • Update to 20.08 version of the ethosn-driver. #6606
  • Version for v0.8 cycle #6615
  • Dynamic ONNX Importer #6351
  • Improve NDArray, GraphRt, and Relay bindings #6563
  • Fix example code #6627
  • Add dot product support for quantized convolution. #6445
  • Fix a bug with Alter Op Layout #6626
  • Link demo_* targets with LDFLAGS and also with -lm. #6636
  • Add qemu build step to CI #6644
  • Add a test for asymmetric padding in ONNX conv and fix Importer to support it #6646
  • Missing header for GraphRuntimeFactory in android_rpc #6648
  • Fix leakyReLU support for CoreML #6651
  • Add Range op to ONNX, make tvm arange shape_func support negative steps #6647
  • Avoid use of builtin math functions #6630
  • Keep fixed dim when unifying dynamic shape (#5795)" #6658
  • Faster sparse_dense on GPUs #6580
  • Fix typographical error. #6664
  • filter out error features #5952
  • don't validate AttrInitEntry until a value has attempted to be set #6672
  • TF argmax - handling int64 datatype #6674
  • Call InferType explicitly in coreml test #6676
  • Adjust Vulkan queue selection and creation logic #6662
  • Revert #5238 #6680
  • Fix format error in integrate.rst #6677
  • Introduce iterator (quasi)affine map detection. #6667
  • util => utils for consistency in the project. #6684
  • int32 pooling with int64 shapes #6687
  • Add µTVM Zephyr support + QEMU regression test #6603
  • Update CI CPU and GPU images based on new Docker build files. #6690
  • Fix detection of crop in convert_batch_to_space_nd #6670
  • Fix tutorial broken by Docker build #6694
  • Recover windows support for the latest LLVM #6698
  • Resolve more warnings in msvc #6702
  • Add cloudpickle dependency to docker images #6701
  • Refactor diagnostic to avoid circular dependencies #6692
  • More robust dll loading behavior after python3.8 #6707
  • Fix the Type bug in ConvertSSA. #6709
  • Support multiple cache read and fix bugs #6686
  • Auto scheduler tutorial failure on CI #6723
  • Fix InferCorrectLayout for dynamic upsampling and add a regression test #6712
  • refactor #6734
  • Create fixed vector size according to latest LLVM12+ changes #6717
  • Add missing python dependency in the setup #6375
  • fix compilation error when setting USE_RELAY_DEBUG #6380
  • Hot fix for Windows CI #6434
  • Add tvm.testing to the docs #6458
  • Save tensor size with alignment #6487
  • Fix a typo in hybrid script tutorial. #6525
  • fix the python script for building resnet (#6526) #6527
  • Print warning when all autotvm tasks fail with errors #6612
  • More descriptive error message when an autotvm task is not found #6652
  • Skip microtvm tests if microtvm is not built #6693
  • Fix cublas batch matmul #6715
  • Use set_property with append flag instead of set_target_properties #6725

Frontend

  • Improve TensorFlow control flow nodes ordering #6387
  • Improve Pytorch frontend for object detection models #6449
  • Add Pytorch OD tutorial #6500
  • Added broadcasting to prelu alpha. #6549
  • Fix TF 1.15 conv2d_transpose parsing #6589

CI

  • Add Vitis-AI docker installation #6342
  • Add black to lint docker image #6451
  • Cancel previous build if a new commit is pushed to a PR #6518
  • remove old pylint 1.9.4 from docker installation script #6538
  • Update ci-cpu to the latest #6632
  • add python environment setup as part of cpp unittest runner script #6639
  • make sure graphviz is on both ci-cpu and ci-gpu images #6645
  • Move to use main as the default #6665
  • Set main as default in github actions #6669
  • Install xgboost>=1.1.0 in CI container #6679
  • CI docker staging update to latest #6708

ONNX

  • Add Clip importer to handle when min/max are provided as inputs. #6251
  • Add support for GatherElements conversion #6446
  • Update Slice op conversion to take strides into account, clean up tests #6467

Ansor

  • Enable random fill and CPU cache flush for AutoTVM and Ansor #6391
  • Phase 2: Layout Rewrite in AutoScheduler #6297
  • Using the template-free auto-scheduler on CPU #6488
  • Auto-scheduler tutorial for GPU and necessary refactor/fix #6512
  • Parallel the InitPopulation #6529
  • Turn on USE_RANDOM by default #6562
  • Bug fix for compute at mutation error #6557
  • Support multiple output ops and fix Python API printing #6584

Test

  • Fix Some Failed Test Cases and Tutorials of The Issue #6453 #6454
  • Remove unintentional pytest dependency #6399
  • Temporary disable test_mutate_parallel #6572
  • improve TEDD tests to also run on CPU Docker image #6643
  • Address flaky error in test_any #6705

TOPI

  • For 1D loop, make an outer loop parallel after axis split #6455
  • Fix missing import in bifrost schedule #6479
  • Fixed a typo in topi key #6502
  • Fix declaration_conv2d_transpose_impl #6428
  • Group conv2d NHWC op implementation #6510
  • Tiny bug fix for non-fp32 datatypes in conv2d_transpose. #6593
  • Allow batch_matmul to broadcast along batch dimension. #6616

Community

  • comaniac -> Committer #6463
  • hypercubestart -> Reviewer #6511
  • Add Ziheng's key for ASF release #6552
  • Zhi's key for ASF release #6554
  • vegaluisjose -> Committer #6582
  • lhutton1 -> Reviewer #6461
  • areusch -> Reviewer #6637
  • junrushao1994 -> committer #6719

TVMC

  • linting error on onnx command line driver frontend #6536
  • Introduce 'tune' subcommand (part 3/4) #6537
  • unify all logs on a single logger 'TVMC' #6577
  • fix command line argument variable name in 'compile' #6574
  • command line driver 'compile' (part 2/4) #6302
  • Getting started tutorial for TVMC #6597
  • fail gracefully in case no subcommand is provided #6625
  • Introduce 'run' subcommand (part 4/4) #6578

Autoscheduler

  • Improve hyperlinks in the tutorial #6521
  • Improve the rule of mutating parallel granularity #6568
  • Improve the GPU tutorial by deleting measure_ctx earlier #6660
  • Improve test cases #6657
  • Fix a bug in thread binding #6683
  • Add task scheduler #6663
  • Use tempfile in tutorials #6728
  • Guarantee init population sampling outputs a valid set #6713

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view of the significance of contributions.

tqchen (88), zhiics (48), comaniac (40), junrushao1994 (38), masahi (16), ZihengJiang (16), tmoreau89 (16), leandron (15), anijain2305 (12), jroesch (12), t-vi (10), mbrookhart (9), FrozenGene (9), siju-samuel (8), kevinthesun (8), jcf94 (8), yongwww (7), icemelon9 (6), merrymercy (6), mbaret (6), yzhliu (5), areusch (5), electriclilies (5), jwfromm (4), u99127 (4), MarisaKirisame (3), cbalint13 (3), binarybana (3), liangfu (2), lhutton1 (2), tkonolige (2), mwillsey (2), manupa-arm (2), Leo-arm (2), vinx13 (1), kparzysz-quic (1), vegaluisjose (1), ANSHUMAN87 (1), antinucleon (1), xqdan (1), rkimball (1), gussmith23 (1), Hzfengsy (1), michalpiszczek (1), robo-corg (1), jtuyls (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities)

tqchen (17), masahi (11), lhutton1 (11), merrymercy (9), rkimball (9), kevinthesun (8), tmoreau89 (6), jainris (6), jroesch (5), areusch (5), zhiics (4), yzhliu (4), leandron (4), tkonolige (4), comaniac (3), junrushao1994 (3), lixiaoquan (3), t-vi (3), trevor-m (3), jcf94 (3), roastduck (3), hypercubestart (3), Johnson9009 (3), Beya2019 (3), ZihengJiang (2), mbrookhart (2), mbaret (2), cbalint13 (2), d-smirnov (2), spectrometerHBH (2), csullivan (2), tom-gall (2), cloud-mxd (2), intheworld (2), icemelon9 (1), MarisaKirisame (1), anijain2305 (1), FrozenGene (1), jwfromm (1), yongwww (1), huajsj (1), giuseros (1), windclarion (1), electriclilies (1), zxy844288792 (1), mwillsey (1), xutianming (1), wjliu1998 (1), wrongtest (1), qixiuai (1), 12101111 (1), minminsun (1), jacobpostman (1), nolanliou (1), WenheLI (1), XIAO-XIA (1), yzh119 (1), anilmartha (1), dlexplorer (1), euntaik (1), DemonGiggle (1), insop (1)
