Can someone please give me the steps to use PT_TVMDSOOP?

Hi, I would like to use PT_TVMDSOOP with TVM.

Could someone please walk me through the steps to get it working?

I managed to build both TVM and PyTorch. However, when I run the Python test files in apps/pt_tvmdsoop/tests/, an error occurs: the import of tvm.contrib.torch fails in all of them.

If you want to reproduce my situation, please use my Docker image or follow the commands below. https://hub.docker.com/repository/docker/hirohaku21/hirohaku_tvm

docker run --name "hirohakuTVM" -it --gpus 0 hirohaku21/hirohaku_tvm:0.2 /bin/bash

This is the error message:

(tvm) root@c225d29abac9:~/tvm/apps/pt_tvmdsoop/tests# python test_torch_graph_module.py 
Traceback (most recent call last):
  File "test_torch_graph_module.py", line 28, in <module>
    import tvm.contrib.torch
  File "/root/tvm/python/tvm/contrib/torch/__init__.py", line 40, in <module>
    _load_platform_specific_library()
  File "/root/tvm/python/tvm/contrib/torch/__init__.py", line 37, in _load_platform_specific_library
    torch.classes.load_library(lib_file_path)
  File "/root/miniconda3/envs/tvm/lib/python3.7/site-packages/torch/_classes.py", line 48, in load_library
    torch.ops.load_library(path)
  File "/root/miniconda3/envs/tvm/lib/python3.7/site-packages/torch/_ops.py", line 244, in load_library
    ctypes.CDLL(path)
  File "/root/miniconda3/envs/tvm/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /root/tvm/build/libpt_tvmdsoop.so: undefined symbol: _ZTIN3c104TypeEE   

OR
OSError: /root/tvm/build/libpt_tvmdsoop.so: undefined symbol: _ZN3tvm7runtime8Registry3GetERKSs

OR 
E   OSError: /root/tvm/build/libpt_tvmdsoop.so: undefined symbol: _ZNK3c104Type14isSubtypeOfExtERKSt10shared_ptrIS0_EPSo
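The undefined symbols above already hint at the cause. In the Itanium C++ mangling scheme, the pre-CXX11 std::string is abbreviated as "Ss", while CXX11-ABI strings show up under the "__cxx11" namespace (or the "B5cxx11" abi tag). A rough heuristic check, written as a sketch (the function name is mine, and this is not a full demangler):

```python
def guess_string_abi(mangled: str) -> str:
    """Heuristically classify which std::string ABI a mangled C++ symbol
    references. CXX11-ABI strings carry "__cxx11"/"B5cxx11" in the mangled
    name; the old ABI abbreviates std::string as "Ss"."""
    if "__cxx11" in mangled or "B5cxx11" in mangled:
        return "cxx11"
    if "Ss" in mangled:
        return "pre-cxx11"
    return "unknown"

# The second undefined symbol above references the old-ABI std::string,
# which is exactly how an ABI mismatch surfaces at dlopen time:
print(guess_string_abi("_ZN3tvm7runtime8Registry3GetERKSs"))  # → pre-cxx11
```

This only works for symbols that mention std::string at all; `c++filt` gives the full demangled form if you want to inspect the others.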

Here is how to reproduce it.

I used nvidia-docker2.

  1. Create a docker container.
docker run --name "MYTVM" -it --gpus 0 nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04 /bin/bash
  1. Install the necessary tools with apt.
apt update
apt-get install -y wget curl git vim python3 python3-dev python3-setuptools gcc libtinfo-dev zlib1g-dev build-essential cmake libedit-dev libxml2-dev
  1. Create and activate a virtual environment with Miniconda.
cd $HOME
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
conda create -n tvm python==3.7.10
conda activate tvm
  1. Build PyTorch v1.12.0.
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses
conda install -c pytorch magma-cuda110
cd $HOME
git clone --recursive -b v1.12.0 https://github.com/pytorch/pytorch
cd pytorch
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
Add this line to pytorch/CMakeLists.txt:
add_definitions(-D_GLIBCXX_USE_CXX11_ABI=0)
python setup.py install
python setup.py develop && python -c "import torch"
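After the build finishes, it is worth confirming that this custom PyTorch really has the CXX11 ABI disabled before building TVM against it. A minimal check, guarded so it degrades gracefully when torch is not importable:

```python
def torch_abi_report():
    """Return True/False for PyTorch's CXX11 ABI, or None if torch
    is not importable in the current environment."""
    try:
        import torch
    except ImportError:
        return None
    # compiled_with_cxx11_abi() reports which ABI this torch build used.
    return torch.compiled_with_cxx11_abi()

print(torch_abi_report())  # expect False for a -D_GLIBCXX_USE_CXX11_ABI=0 build
```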
  1. Download LLVM
cd $HOME
wget https://github.com/llvm/llvm-project/releases/download/llvmorg-12.0.0/clang+llvm-12.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz
tar -Jxvf clang+llvm-12.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz
  1. Build TVM v0.8.0.
cd $HOME
git clone --recursive https://github.com/apache/tvm.git -b v0.8.0
cd tvm
mkdir build
cp cmake/config.cmake build/
cd build

I set config.cmake like this:

set(USE_CUDA ON)
set(USE_ROCM OFF)
set(USE_SDACCEL OFF)
set(USE_AOCL OFF)
set(USE_OPENCL OFF)
set(USE_METAL OFF)
set(USE_VULKAN OFF)
set(USE_OPENGL OFF)
set(USE_MICRO OFF)
set(USE_RPC ON)
set(USE_CPP_RPC OFF)
set(USE_IOS_RPC OFF)
set(USE_STACKVM_RUNTIME OFF)
set(USE_GRAPH_EXECUTOR ON)
set(USE_GRAPH_EXECUTOR_CUDA_GRAPH OFF)
set(USE_PIPELINE_EXECUTOR OFF)
set(USE_PROFILER ON)
set(USE_MICRO_STANDALONE_RUNTIME OFF)
set(USE_LLVM "/root/clang+llvm-12.0.0-x86_64-linux-gnu-ubuntu-20.04/bin/llvm-config --link-static")
set(USE_BYODT_POSIT OFF)
set(USE_BLAS none)
set(USE_MKL OFF)
set(USE_MKLDNN OFF)
set(USE_OPENMP none)
set(USE_RANDOM ON)
set(USE_NNPACK OFF)
set(USE_TFLITE OFF)
set(USE_TENSORFLOW_PATH none)
set(USE_FLATBUFFERS_PATH none)
set(USE_EDGETPU OFF)
set(USE_CUDNN ON)
set(USE_CUBLAS ON)
set(USE_MIOPEN OFF)
set(USE_MPS OFF)
set(USE_ROCBLAS OFF)
set(USE_SORT ON)
set(USE_DNNL_CODEGEN OFF)
set(USE_ARM_COMPUTE_LIB OFF)
set(USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR OFF)
set(USE_ETHOSN OFF)
set(USE_ETHOSN_HW OFF)
set(USE_ETHOSU OFF)
set(USE_TENSORRT_CODEGEN OFF)
set(USE_TENSORRT_RUNTIME OFF)
set(USE_VITIS_AI OFF)
set(USE_VERILATOR OFF)
set(USE_ANTLR OFF)
set(USE_RELAY_DEBUG OFF)
set(USE_VTA_FSIM OFF)
set(USE_VTA_TSIM OFF)
set(USE_VTA_FPGA OFF)
set(USE_THRUST OFF)
set(USE_TF_TVMDSOOP OFF)
set(USE_PT_TVMDSOOP ON)
set(USE_FALLBACK_STL_MAP OFF)
set(USE_HEXAGON_DEVICE OFF)
set(USE_HEXAGON_SDK /path/to/sdk)
set(USE_HEXAGON_LAUNCHER OFF)
set(USE_HEXAGON_ARCH "v66")
set(USE_TARGET_ONNX OFF)
set(USE_BNNS OFF)
set(USE_LIBBACKTRACE AUTO)
set(BUILD_STATIC_RUNTIME OFF)
set(USE_CCACHE AUTO)
set(USE_PAPI OFF)
set(USE_GTEST AUTO)
set(USE_CUTLASS OFF)
vim $HOME/tvm/CMakeLists.txt

Rewrite the cmake_minimum_required line as:
cmake_minimum_required(VERSION 3.20)

cmake ..
make -j12

vim ~/.bashrc
export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
source ~/.bashrc
conda activate tvm
conda install decorator pytest scipy
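To reproduce the failure directly, without going through the test files or PyTorch's loader, the library can be dlopen'ed with ctypes. A small diagnostic sketch (the path below is from my build; substitute yours):

```python
import ctypes

def try_load(path: str):
    """Attempt to dlopen a shared library; return None on success,
    or the OSError message (e.g. an undefined-symbol report) on failure."""
    try:
        ctypes.CDLL(path)
        return None
    except OSError as e:
        return str(e)

# Pointing this at the built extension surfaces the same
# "undefined symbol" message as the traceback above.
err = try_load("/root/tvm/build/libpt_tvmdsoop.so")
if err is not None:
    print("load failed:", err)
```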

Hi Hirohaku,

Thanks for reporting this issue! I was able to reproduce the “undefined symbol” error and finally managed to address it after some trial and error :slight_smile:

Root cause

The official wheels provided by PyTorch are built with the CXX11 ABI off, for both pip and conda installations. To be compatible with PyTorch, TVM’s PyTorch extension uses the flag -D_GLIBCXX_USE_CXX11_ABI=0 to disable the CXX11 ABI, as indicated in its cmake.

However, the dependencies of libpt_tvmdsoop, including libtvm and libLLVM, are all built with the CXX11 ABI. When they are linked together, the ABI mismatch leads to undefined-symbol errors between TVM and PyTorch.

The ABI of PyTorch can be checked with the following commands:

import torch
print(torch.compiled_with_cxx11_abi())

Similar issues have been extensively reported in PyTorch forums, for example, Undefined symbol when import lltm cpp extension.

Solution

Long story short, the ABIs need to be made consistent, either all CXX11 or all non-CXX11, to avoid linking issues.

All non-CXX11. To avoid rebuilding PyTorch, one has to rebuild TVM with -D_GLIBCXX_USE_CXX11_ABI=0 as a global CMake flag. Note that libLLVM will likely need to be rebuilt as well for consistency.

All CXX11. In this case, it’s still possible to find a prebuilt PyTorch with the CXX11 ABI, and by turning the CXX11 ABI on everywhere, we can get the entire flow working.

  • Step 1. Install PyTorch with CXX11 ABI
conda install pytorch  # Note: don't use "-c pytorch" as indicated on the official website; that build is not compiled with the CXX11 ABI
python -c "import torch; print(torch.compiled_with_cxx11_abi())" # should print "True"
  • Step 2. Edit TVM’s cmake file to produce CXX11-compatible libpt_tvmdsoop, on this line:
-  set(PT_COMPILE_FLAGS_STR "-I${PT_PATH}/include -D_GLIBCXX_USE_CXX11_ABI=0")
+  set(PT_COMPILE_FLAGS_STR "-I${PT_PATH}/include")
  • Step 3. If symbol issues persist, try switching between clang++ and g++ by setting $CC and $CXX.
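The rule behind both options can be captured in one line: the flag that libpt_tvmdsoop is compiled with must match what PyTorch reports. A hypothetical helper (the function name is mine, not part of TVM or PyTorch):

```python
def abi_flag(torch_uses_cxx11_abi: bool) -> str:
    """Return the GCC macro definition matching PyTorch's reported ABI,
    i.e. the value torch.compiled_with_cxx11_abi() prints. The result is
    what PT_COMPILE_FLAGS_STR in TVM's cmake should (or should not) carry."""
    return "-D_GLIBCXX_USE_CXX11_ABI={}".format(1 if torch_uses_cxx11_abi else 0)

# e.g. for a CXX11-ABI conda PyTorch:
print(abi_flag(True))   # → -D_GLIBCXX_USE_CXX11_ABI=1
```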

Summary

We either need a PyTorch build with the CXX11 ABI, or we need to build the entire stack without the CXX11 ABI to match the default PyTorch.

Please let me know if you have any questions!

3 Likes

Thank you for your help!
I can use PT_TVMDSOOP now!
I have uploaded a Docker image for anyone else confused about pt_tvmdsoop.

Please download the image with version number 0.3.

1 Like

@ThinkANameIshard Can I use pip to install PyTorch with the CXX11 ABI?