TVM transformer pruning results in slowdown

I tried to run the code from the tutorial Deploy a Hugging Face Pruned Model on CPU. The example results there show a speedup of roughly 2.5x on a CPU, but when I run the same code I get a slowdown instead.

Results from running the code on my machine:

Dense Model Benchmark:

One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.

Runtime: 563.27 ms (73.29 ms)

Block Sparse Model with 1x1 blocks:

Runtime: 805.21 ms (4.26 ms)
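To put a number on the slowdown, here is a small sketch that compares the two measurements above. It assumes the value in parentheses is the standard deviation over the benchmark runs, as in the tutorial's output format; the helper names `slowdown` and `relative_spread` are only illustrative.

```python
# Figures copied from the benchmark output above: mean runtime and the
# value in parentheses (assumed here to be the standard deviation).
dense_mean_ms, dense_std_ms = 563.27, 73.29
sparse_mean_ms, sparse_std_ms = 805.21, 4.26


def slowdown(baseline_ms, candidate_ms):
    """Return how many times slower the candidate is than the baseline."""
    return candidate_ms / baseline_ms


def relative_spread(mean_ms, std_ms):
    """Return the standard deviation as a fraction of the mean."""
    return std_ms / mean_ms


ratio = slowdown(dense_mean_ms, sparse_mean_ms)
print(f"sparse / dense runtime: {ratio:.2f}x")
print(f"dense spread:  {relative_spread(dense_mean_ms, dense_std_ms):.1%}")
print(f"sparse spread: {relative_spread(sparse_mean_ms, sparse_std_ms):.1%}")
```

With these figures the sparse model comes out roughly 1.43 times slower than the dense one, rather than about 2.5 times faster. The dense measurement also has a much larger spread than the sparse one, so the dense baseline itself looks noisy on this machine.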

The machine is a laptop running Fedora 33 Cinnamon.

I am using Python 3.8.5 and LLVM version 11.0.0.

Results from <<lscpu>>

CPU info:

Architecture: x86_64

CPU op-mode(s): 32-bit, 64-bit

Byte Order: Little Endian

Address sizes: 39 bits physical, 48 bits virtual

CPU(s): 8

On-line CPU(s) list: 0-7

Thread(s) per core: 2

Core(s) per socket: 4

Socket(s): 1

NUMA node(s): 1

Vendor ID: GenuineIntel

CPU family: 6

Model: 142

Model name: Intel(R) Core™ i5-8250U CPU @ 1.60GHz

Stepping: 10

CPU MHz: 800.152

CPU max MHz: 3400.0000

CPU min MHz: 400.0000

BogoMIPS: 3600.00

Virtualization: VT-x

L1d cache: 128 KiB

L1i cache: 128 KiB

L2 cache: 1 MiB

L3 cache: 6 MiB

NUMA node0 CPU(s): 0-7

Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled

Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable

Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable

Vulnerability Meltdown: Mitigation; PTI

Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp

Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization

Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling

Vulnerability Srbds: Mitigation; Microcode

Vulnerability Tsx async abort: Not affected

Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

Results from <<conda list>>
Name Version Build Channel
_tflow_select 2.3.0 eigen
absl-py 0.13.0 py38h06a4308_0
aiohttp 3.7.4 py38h27cfd23_1
astor 0.8.1 py38h06a4308_0
astunparse 1.6.3 py_0
async-timeout 3.0.1 py38h06a4308_0
attrs 20.2.0 py_0 anaconda
beautifulsoup4 4.9.3 pyhb0f4dca_0 anaconda
blas 1.0 mkl anaconda
blinker 1.4 py38h06a4308_0
brotlipy 0.7.0 py38h7b6447c_1000 anaconda
bzip2 1.0.8 h7b6447c_0 anaconda
c-ares 1.17.1 h27cfd23_0
ca-certificates 2021.7.5 h06a4308_1
cachetools 4.2.2 pyhd3eb1b0_0
certifi 2021.5.30 py38h06a4308_0
cffi 1.14.3 py38he30daa8_0 anaconda
chardet 3.0.4 py38_1003 anaconda
click 8.0.1 pyhd3eb1b0_0
cmake 3.18.2 ha30ef3c_0 anaconda
conda 4.9.0 py38_0 anaconda
conda-build 3.21.4 py38h06a4308_0
conda-package-handling 1.7.2 py38h03888b9_0 anaconda
coverage 5.5 py38h27cfd23_2
cryptography 3.1.1 py38h1ba5d50_0 anaconda
cycler 0.10.0 py38_0
cython 0.29.21 py38he6710b0_0 anaconda
dbus 1.13.18 hb2f20db_0
expat 2.2.10 he6710b0_2 anaconda
filelock 3.0.12 py_0 anaconda
fontconfig 2.13.1 h6c09931_0
freetype 2.10.4 h5ab3b9f_0 anaconda
gast 0.3.3 py_0
git 2.23.0 pl526hacde149_0 anaconda
glib 2.68.1 h36276a3_0
glob2 0.7 py_0 anaconda
google-auth 1.28.0 pyhd3eb1b0_0
google-auth-oauthlib 0.4.2 pyhd3eb1b0_2
google-pasta 0.2.0 py_0
grpcio 1.36.1 py38h2157cd5_1
gst-plugins-base 1.14.0 h8213a91_2
gstreamer 1.14.0 h28cd5cc_2
h5py 2.10.0 py38hd6299e0_1
hdf5 1.10.6 hb1b8bf9_0
huggingface-hub 0.0.12 pypi_0 pypi
icu 58.2 he6710b0_3 anaconda
idna 2.10 py_0 anaconda
importlib-metadata 3.10.0 py38h06a4308_0
iniconfig 1.1.1 py_0 anaconda
intel-openmp 2020.2 254 anaconda
jinja2 2.11.2 py_0 anaconda
joblib 1.0.1 pypi_0 pypi
jpeg 9d h36c2ea0_0 conda-forge
keras 2.4.3 0
keras-base 2.4.3 py_0
keras-preprocessing 1.1.2 pyhd3eb1b0_0
kiwisolver 1.3.1 py38h2531618_0
krb5 1.18.2 h173b8e3_0 anaconda
lcms2 2.11 h396b838_0 anaconda
ld_impl_linux-64 2.33.1 h53a641e_7 anaconda
libarchive 3.4.2 h62408e4_0
libcurl 7.71.1 h20c2e04_1 anaconda
libedit 3.1.20191231 h14c3975_1 anaconda
libffi 3.3 he6710b0_2 anaconda
libgcc-ng 9.1.0 hdf63c60_0 anaconda
libgfortran-ng 7.3.0 hdf63c60_0 anaconda
liblief 0.10.1 he6710b0_0 anaconda
libllvm10 10.0.0 h4a3c616_1 anaconda
libpng 1.6.37 hbc83047_0 anaconda
libprotobuf 3.14.0 h8c45485_0
libssh2 1.9.0 h1ba5d50_1 anaconda
libstdcxx-ng 9.1.0 hdf63c60_0 anaconda
libtiff 4.1.0 h4f3a223_6 conda-forge
libuuid 1.0.3 h1bed415_2
libuv 1.40.0 h7b6447c_0 anaconda
libwebp-base 1.1.0 h7b6447c_3 anaconda
libxcb 1.14 h7b6447c_0
libxml2 2.9.10 hb55368b_3 anaconda
llvm-tools 10.0.0 h4a3c616_1 anaconda
llvmdev 10.0.0 h4a3c616_1 anaconda
lz4-c 1.9.2 heb0550a_3 anaconda
lzo 2.10 h7b6447c_2 anaconda
make 4.2.1 h1bed415_1 anaconda
markdown 3.3.4 py38h06a4308_0
markupsafe 1.1.1 py38h7b6447c_0 anaconda
matplotlib 3.3.4 py38h06a4308_0
matplotlib-base 3.3.4 py38h62a2d02_0
mkl 2019.4 243 anaconda
mkl-service 2.3.0 py38he904b0f_0 anaconda
mkl_fft 1.2.0 py38h23d657b_0 anaconda
mkl_random 1.1.0 py38h962f231_0 anaconda
more-itertools 8.5.0 py_0 anaconda
multidict 5.1.0 py38h27cfd23_2
ncurses 6.2 he6710b0_1 anaconda
numpy 1.19.1 py38hbc911f0_0 anaconda
numpy-base 1.19.1 py38hfa32c7d_0 anaconda
oauthlib 3.1.0 py_0
olefile 0.46 py_0 anaconda
openssl 1.1.1k h27cfd23_0
opt_einsum 3.3.0 pyhd3eb1b0_1
packaging 20.4 py_0 anaconda
patchelf 0.12 he6710b0_0 anaconda
pcre 8.44 he6710b0_0 anaconda
perl 5.26.2 h14c3975_0 anaconda
pillow 8.0.0 py38h9a89aac_0 anaconda
pip 20.2.4 py38_0 anaconda
pkginfo 1.6.0 py38_0 anaconda
pluggy 0.13.1 py38_0 anaconda
protobuf 3.14.0 py38h2531618_1
psutil 5.7.2 py38h7b6447c_0 anaconda
py 1.9.0 py_0 anaconda
py-lief 0.10.1 py38h403a769_0 anaconda
pyasn1 0.4.8 py_0
pyasn1-modules 0.2.8 py_0
pycosat 0.6.3 py38h7b6447c_1 anaconda
pycparser 2.20 py_2 anaconda
pyjwt 1.7.1 py38_0
pyopenssl 19.1.0 py_1 anaconda
pyparsing 2.4.7 py_0 anaconda
pyqt 5.9.2 py38h05f1152_4
pysocks 1.7.1 py38_0 anaconda
pytest 6.1.1 py38_0 anaconda
python 3.8.5 h7579374_1 anaconda
python-dateutil 2.8.1 pyhd3eb1b0_0
python-libarchive-c 2.9 py_0 anaconda
pytz 2020.1 py_0 anaconda
pyyaml 5.3.1 py38h7b6447c_1 anaconda
qt 5.9.7 h5867ecd_1
readline 8.0 h7b6447c_0 anaconda
regex 2021.7.6 pypi_0 pypi
requests 2.24.0 py_0 anaconda
requests-oauthlib 1.3.0 py_0
rhash 1.4.0 h1ba5d50_0 anaconda
ripgrep 12.1.1 0 anaconda
rsa 4.7.2 pyhd3eb1b0_1
ruamel_yaml 0.15.87 py38h7b6447c_1 anaconda
sacremoses 0.0.45 pypi_0 pypi
scipy 1.5.2 py38h0b6359f_0 anaconda
setuptools 50.3.0 py38hb0f4dca_1 anaconda
sip 4.19.13 py38he6710b0_0
six 1.15.0 py_0 anaconda
soupsieve 2.0.1 py_0 anaconda
sqlite 3.33.0 h62c20be_0 anaconda
tensorboard 2.4.0 pyhc547734_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.3.0 eigen_py38h71ff20e_0
tensorflow-base 2.3.0 eigen_py38hb57a387_0
tensorflow-estimator 2.5.0 pyh7b7c402_0
termcolor 1.1.0 py38h06a4308_1
tk 8.6.10 hbc83047_0 anaconda
tokenizers 0.10.3 pypi_0 pypi
toml 0.10.1 py_0 anaconda
tornado 6.1 py38h27cfd23_0
tqdm 4.50.2 py_0 anaconda
transformers 4.8.2 pypi_0 pypi
typing-extensions 3.10.0.0 hd3eb1b0_0
typing_extensions 3.10.0.0 pyh06a4308_0
urllib3 1.25.11 py_0 anaconda
werkzeug 1.0.1 pyhd3eb1b0_0
wheel 0.35.1 py_0 anaconda
wrapt 1.12.1 py38h7b6447c_1
xz 5.2.5 h7b6447c_0 anaconda
yaml 0.2.5 h7b6447c_0 anaconda
yarl 1.6.3 py38h27cfd23_0
zipp 3.5.0 pyhd3eb1b0_0
zlib 1.2.11 h7b6447c_3 anaconda
zstd 1.4.5 h9ceee32_0 anaconda

Results from <<pip list>>
Package Version
absl-py 0.13.0
aiohttp 3.7.4
astor 0.8.1
astunparse 1.6.3
async-timeout 3.0.1
attrs 20.2.0
beautifulsoup4 4.9.3
blinker 1.4
brotlipy 0.7.0
cachetools 4.2.2
certifi 2021.5.30
cffi 1.14.3
chardet 3.0.4
click 8.0.1
cloudpickle 1.6.0
conda 4.9.0
conda-build 3.21.4
conda-package-handling 1.7.2
coverage 5.5
cryptography 3.1.1
cycler 0.10.0
Cython 0.29.21
decorator 5.0.9
filelock 3.0.12
gast 0.3.3
glob2 0.7
google-auth 1.28.0
google-auth-oauthlib 0.4.2
google-pasta 0.2.0
grpcio 1.36.1
h5py 2.10.0
huggingface-hub 0.0.12
idna 2.10
importlib-metadata 3.10.0
iniconfig 1.1.1
Jinja2 2.11.2
joblib 1.0.1
Keras 2.4.3
Keras-Preprocessing 1.1.2
kiwisolver 1.3.1
libarchive-c 2.9
Markdown 3.3.4
MarkupSafe 1.1.1
matplotlib 3.3.4
mkl-fft 1.2.0
mkl-random 1.1.0
mkl-service 2.3.0
more-itertools 8.5.0
multidict 5.1.0
numpy 1.19.1
oauthlib 3.1.0
olefile 0.46
opt-einsum 3.3.0
packaging 20.4
Pillow 8.0.0
pip 20.2.4
pkginfo 1.6.0
pluggy 0.13.1
protobuf 3.14.0
psutil 5.7.2
py 1.9.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycosat 0.6.3
pycparser 2.20
PyJWT 1.7.1
pyOpenSSL 19.1.0
pyparsing 2.4.7
PySocks 1.7.1
pytest 0.0.0
python-dateutil 2.8.1
pytz 2020.1
PyYAML 5.3.1
regex 2021.7.6
requests 2.24.0
requests-oauthlib 1.3.0
rsa 4.7.2
ruamel-yaml 0.15.87
sacremoses 0.0.45
scipy 1.5.2
setuptools 50.3.0.post20201006
sip 4.19.13
six 1.15.0
soupsieve 2.0.1
synr 0.3
tensorboard 2.4.0
tensorboard-plugin-wit 1.6.0
tensorflow 2.3.0
tensorflow-estimator 2.5.0
termcolor 1.1.0
tokenizers 0.10.3
toml 0.10.1
tornado 6.1
tqdm 4.50.2
transformers 4.8.2
tvm 0.8.dev1300+g3a9a38822
typing-extensions 3.10.0.0
urllib3 1.25.11
Werkzeug 1.0.1
wheel 0.35.1
wrapt 1.12.1
yarl 1.6.3
zipp 3.5.0

You need to tune your model. See Performance regression of sparse BERT example.
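For reference, a minimal sketch of what tuning could look like with TVM's auto-scheduler. It assumes you already have the Relay module `mod` and its `params` from the tutorial's import step. The function name `tune_and_build`, the log file name, and the trial count are placeholders, not values from the tutorial or the linked thread. Tuning can take a long time on a laptop CPU, and this is not guaranteed to reproduce the tutorial's speedup.

```python
def tune_and_build(mod, params, target="llvm",
                   log_file="tuning_log.json", num_trials=200):
    """Tune a Relay module with the auto-scheduler, then build it
    using the best schedules found. All defaults are placeholders."""
    # Imported lazily so this file can be loaded on machines without TVM.
    import tvm
    from tvm import auto_scheduler, relay

    target = tvm.target.Target(target)

    # Split the network into tunable tasks, weighted by how often each occurs.
    tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)

    # Measure candidate schedules locally and record the results to log_file.
    tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
    tune_option = auto_scheduler.TuningOptions(
        num_measure_trials=num_trials,
        runner=auto_scheduler.LocalRunner(repeat=10, enable_cpu_cache_flush=True),
        measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
    )
    tuner.tune(tune_option)

    # Rebuild the model, applying the best recorded schedule for each task.
    with auto_scheduler.ApplyHistoryBest(log_file):
        with tvm.transform.PassContext(
            opt_level=3, config={"relay.backend.use_auto_scheduler": True}
        ):
            return relay.build(mod, target=target, params=params)
```

You would call it as `lib = tune_and_build(mod, params)` and then benchmark `lib` the same way as before. A target string that names the CPU features explicitly (for example `llvm -mcpu=core-avx2` on this machine) may help as well, but that is an assumption to verify rather than a confirmed fix.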