Hi, I’m trying to understand how to use sparse inference in TVM. This is a standalone script demonstrating what I’m trying to do:
import onnx
import onnx.shape_inference
import tvm
from tvm import relay
onnx_model = onnx.load("/home/vaclav/scripts/dilated-conv-stack.onnx")
# onnx_model = onnx.load("/home/vaclav/scripts/dilated-conv-stack-ib.onnx")
input_shape = [
    d.dim_value for d in onnx_model.graph.input[0].type.tensor_type.shape.dim
]
input_name = onnx_model.graph.input[0].name
shape_dict = {input_name: input_shape}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
bs_r, bs_c = 4, 1
sparsity = 0.9
layout = "NCHW"
from tvm.relay import data_dep_optimization as ddo
mod, params = ddo.simplify_fc_transpose.convert(mod["main"], params)
# The code below is similar to thesis.runtimes.tvm.convert_model_dense_to_sparse,
# but I've set the threshold to 0.0 to force sparsification.
from tvm.topi.sparse.utils import random_sparse_conv2d_params
# Manually replace the parameters of conv2d to sparse tensors
params = random_sparse_conv2d_params(
    mod, params, bs_r=bs_r, bs_c=bs_c, density=1 - sparsity, layout=layout
)
# convert dense conv2d to sparse conv2d
mod, params = ddo.bsr_conv2d.convert(
    mod, params, (bs_r, bs_c), sparsity_threshold=0.0, layout=layout
)
sparsified = False
for k in params.keys():
    # The new parameters should have some keys ending with .indices, .data and .indptr
    if k.endswith(".indices"):
        sparsified = True
if not sparsified:
    print("Didn't crash but didn't sparsify!")
I’ve succeeded in sparsifying a simple CNN (inverted-bottleneck.onnx). On a network with dilated convolutions (dilated-conv-stack.onnx), there is no crash, but nothing is sparsified. And when I try the same on a network using MobileNet-v2-style inverted bottleneck layers (dilated-conv-stack-ib.onnx), I get this error:
Traceback (most recent call last):
  File "thesis/scripts/tvm_sparsity_conv_bug.py", line 37, in <module>
    mod, params = ddo.bsr_conv2d.convert(
  File "/home/vaclav/venv3.8/lib/python3.8/site-packages/tvm-0.8.0-py3.8-linux-x86_64.egg/tvm/relay/data_dep_optimization/bsr_conv2d.py", line 52, in convert
    weight_info = process_params(func, params, blocksize, sparsity_threshold, layout, kernel_size)
  File "/home/vaclav/venv3.8/lib/python3.8/site-packages/tvm-0.8.0-py3.8-linux-x86_64.egg/tvm/relay/analysis/sparse_conv2d.py", line 108, in process_params
    sparse_weight = sp.bsr_matrix(w_np, blocksize=block_size)
  File "/home/vaclav/venv3.8/lib/python3.8/site-packages/scipy/sparse/bsr.py", line 185, in __init__
    arg1 = coo_matrix(arg1, dtype=dtype).tobsr(blocksize=blocksize)
  File "/home/vaclav/venv3.8/lib/python3.8/site-packages/scipy/sparse/base.py", line 933, in tobsr
    return self.tocsr(copy=False).tobsr(blocksize=blocksize, copy=copy)
  File "/home/vaclav/venv3.8/lib/python3.8/site-packages/scipy/sparse/csr.py", line 213, in tobsr
    raise ValueError('invalid blocksize %s' % blocksize)
TypeError: not all arguments converted during string formatting
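For what it's worth, the TypeError at the bottom looks like scipy's own error message masking the real failure: 'invalid blocksize %s' % blocksize formats a tuple with a single %s, which itself raises TypeError before the ValueError is ever constructed. The underlying complaint would be "invalid blocksize", i.e. the block size doesn't evenly divide the weight matrix shape. A minimal sketch reproducing both layers of the failure (the 6x8 shape is made up just so that 4 doesn't divide the row count; it isn't from my networks):

```python
import numpy as np
import scipy.sparse as sp

# '%s' % (4, 1) substitutes only the first tuple element, then raises
# TypeError because the second element is left unconsumed -- this is
# what turns scipy's ValueError into the TypeError seen above.
try:
    raise ValueError('invalid blocksize %s' % (4, 1))
except TypeError as e:
    print("masked as:", e)  # masked as: not all arguments converted during string formatting

# The real complaint: a (4, 1) blocksize needs the row count to be a
# multiple of 4, and a 6x8 weight matrix is not. (Older scipy raises
# the masked TypeError here; fixed versions raise ValueError.)
try:
    sp.bsr_matrix(np.zeros((6, 8)), blocksize=(4, 1))
except (ValueError, TypeError) as e:
    print(type(e).__name__, e)
```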
Here are the three networks I tried this on: inverted-bottleneck.onnx works fine; dilated-conv-stack.onnx is not sparsified at all (presumably because no layer has kernel size 1?); and dilated-conv-stack-ib.onnx breaks outright for some reason.
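My current guess for the crash is that one of the conv layers in the ib variant has a channel count that isn't a multiple of bs_r = 4, since scipy's BSR format requires the block size to divide the 2-D weight shape exactly. A quick standalone check of that requirement (the shapes here are made-up examples, not taken from my networks):

```python
def bsr_compatible(shape, bs_r, bs_c):
    """Return True if a 2-D weight of `shape` can be stored as BSR with
    block size (bs_r, bs_c); scipy requires exact divisibility."""
    rows, cols = shape
    return rows % bs_r == 0 and cols % bs_c == 0

print(bsr_compatible((16, 32), 4, 1))  # True: 4 divides 16
print(bsr_compatible((6, 32), 4, 1))   # False: 4 does not divide 6
```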
Does anybody have an idea what the problem might be? Thanks! My TVM version is 0.8.0, Python 3.8.12.