Error with Relax while building vicuna-7b with mlc-llm build.py: Module stackvm should be either dso exportable or binary serializable

I got "Module stackvm should be either dso exportable or binary serializable." when running vicuna-7b-delta-v0 with mlc_chat_cli, using the relax repo with the mlc branch (hash: 6fd55bc).

    cd mlc-llm/dist/models
    git clone https://huggingface.co/lmsys/vicuna-7b-delta-v0

The error occurs while compiling the model:

    cd mlc-llm
    python3 build.py --hf-path vicuna-7b-delta-v0 --target auto --quantization q3f16_0

Does anybody have an idea? Hoping for the best.

Here are some error logs:

    Weights exist at dist/models/vicuna-7b-delta-v0, skipping download.
    Using path "dist/models/vicuna-7b-delta-v0" for model "vicuna-7b-delta-v0"
    Database paths: ['log_db/dolly-v2-3b', 'log_db/redpajama-3b-q4f16', 'log_db/redpajama-3b-q4f32', 'log_db/rwkv-raven-1b5', 'log_db/rwkv-raven-3b', 'log_db/rwkv-raven-7b', 'log_db/vicuna-v1-7b']
    Target configured: cuda -keys=cuda,gpu -arch=sm_70 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
    Automatically using target for weight quantization: cuda -keys=cuda,gpu -arch=sm_70 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
    Traceback (most recent call last):
      File "build.py", line 420, in <module>
        main()
      File "build.py", line 398, in main
        mod = mod_transform_before_build(mod, params, ARGS)
      File "build.py", line 281, in mod_transform_before_build
        new_params = utils.transform_params(mod_transform, model_params, args)
      File "/home/repo/mlc-llm/mlc_llm/utils.py", line 255, in transform_params
        vm = relax.vm.VirtualMachine(ex, device)
      File "/home/repo/tvm-unity/python/tvm/runtime/relax_vm.py", line 81, in __init__
        rt_mod = rt_mod.jit()
      File "/home/repo/tvm-unity/python/tvm/relax/vm_build.py", line 89, in jit
        not_runnable_list = self.mod._collect_from_import_tree(_not_runnable)
      File "/home/repo/tvm-unity/python/tvm/runtime/module.py", line 426, in _collect_from_import_tree
        assert (
    AssertionError: Module stackvm should be either dso exportable or binary serializable.

vicuna-7b-delta-v0 contains only DELTA weights on top of LLaMA; there is a note in the HF repo:

NOTE: This “delta model” cannot be used directly. Users have to apply it on top of the original LLaMA weights to get actual Vicuna weights. See https://github.com/lm-sys/FastChat#vicuna-weights for instructions.
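For reference, merging the delta onto the base LLaMA weights is done with FastChat's apply_delta tool. A sketch with placeholder paths (flag names have varied across FastChat versions, so treat this as an example and check the README linked above):

    python3 -m fastchat.model.apply_delta \
        --base-model-path /path/to/llama-7b-hf \
        --target-model-path dist/models/vicuna-7b \
        --delta-path lmsys/vicuna-7b-delta-v0

It is the merged output directory, not the raw delta repo, that build.py should then consume.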

The error " AssertionError: Module stackvm should be either dso exportable or binary serializable." happens when you didn’t compile TVM with LLVM enabled (and in this case, stackvm interpreter is used as the host target).

You can try our pre-built wheels (MLC Packages), where LLVM is enabled by default.
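To double-check which TVM your Python environment is actually importing and whether it has LLVM enabled, you can query the build info (this relies on tvm.support.libinfo(), which TVM Unity exposes):

    python -c "import tvm; print(tvm.__file__)"
    python -c "import tvm; print(tvm.support.libinfo().get('USE_LLVM'))"

If the second command prints OFF (or nothing), the interpreter is picking up a TVM build without LLVM, which matches the stackvm assertion above.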

Hi yzh119, I am getting the same error

AssertionError: Module stackvm should be either dso exportable or binary serializable.

even after installing the pre-built wheels as you suggested. Can you please help me resolve this issue?

I am using a fine-tuned Llama 2 chat model.

@ark_knight Could you share the output of our validation step here: Install TVM Unity Compiler — mlc-llm 0.1.0 documentation? Particularly from step 3

I also got "AssertionError: Module stackvm should be either dso exportable or binary serializable." And here are some error logs:

Does anybody have an idea? Hoping for the best.

Sure, please find attached screenshot 1 of the output of the validation step.

Screenshot 2 of the output of the validation step

As you can tell, USE_LLVM is set to OFF, which means your TVM was not compiled with LLVM. If you really want to build TVM from source, please follow the instructions here: Install TVM Unity Compiler — mlc-llm 0.1.0 documentation
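For a source build, the essential part is turning on LLVM in config.cmake before compiling. A minimal sketch of the flow (the linked docs are the authoritative reference; USE_LLVM can also be set to the path of a specific llvm-config instead of ON):

    cd tvm-unity
    mkdir -p build && cp cmake/config.cmake build/
    # edit build/config.cmake so that it contains: set(USE_LLVM ON)
    cd build
    cmake .. && make -j"$(nproc)"

After rebuilding and reinstalling the Python package, the validation step should report USE_LLVM as ON (or the llvm-config path you gave it).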

With the current version of TVM, you need to add one more interface to your own XXXModuleNode implementation (it was not required before): newer TVM asserts that every module in the import tree is either DSO exportable or binary serializable, and it learns this from GetPropertyMask(). With that added, the error goes away. Roughly like this:

  class XXXModuleNode : public runtime::ModuleNode {
    ...
    /*! \brief Get the property of the runtime module. */
    int GetPropertyMask() const final {
      return ModulePropertyMask::kBinarySerializable | ModulePropertyMask::kRunnable;
    }
    ...
  };

Yes, I followed the instructions to install mlc-ai-nightly, but I still got "USE_LLVM: OFF" and "LLVM_VERSION: NOT_FOUND". I'd like to ask whether any dependencies are needed before installing the prebuilt TVM, for example LLVM, zlib, zstd, or something else. I'll greatly appreciate your help!