Cannot run the model with Vulkan backend on Android

I’m not familiar with the Android toolchain, but this README (tvm/README.md at main · apache/tvm · GitHub) suggests android-24 is the right one?

android-24 is the version of the SDK platform, but I need to use the .cmake file from the NDK.

Sorry, I will investigate this topic and post the correct results of my work.

After a short investigation I decided to use the following command:

cmake -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-24 -DANDROID_NATIVE_API_LEVEL=23 -DANDROID_ARM_NEON=ON ..

with a few small adjustments. It found Vulkan_LIBRARY at the path

Vulkan_LIBRARY=/home/whoami/android-ndk-r21e/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android/24/libvulkan.so

This means that I need to point the other paths in findVulkan.cmake manually at the downloaded Vulkan SDK, like this:

Vulkan_SPIRV_TOOLS_LIBRARY=/whoami/vulkan/1.2.162.1/x86_64/lib/libSPIRV-Tools.a

and

Vulkan_INCLUDE_DIRS=/home/whoami/vulkan/1.2.162.1/x86_64/include;/home/whoami/vulkan/1.2.162.1/x86_64/include/spirv-tools;/home/whoami/vulkan/1.2.162.1/x86_64/include/spirv/unified1;/home/whoami/vulkan/1.2.162.1/x86_64/include/spirv/unified1

The conclusion of this investigation: when compiling tvm_runtime for Android, use the libvulkan.so from the Android NDK folder.

Dear @masahi, could you please help me one more time? I successfully compiled tvm_runtime with Vulkan, and my model with Vulkan, for Android deployment. But when I run the program on an Android device, it crashes with the following log:

E/libc++abi: terminating with uncaught exception of type tvm::runtime::InternalError: [14:09:13] /home/whoami/tvm/src/runtime/vulkan/vulkan_stream.cc:137: 
    ---------------------------------------------------------------
    An error occurred during the execution of TVM.
    For more information, please see: https://tvm.apache.org/docs/errors.html
    ---------------------------------------------------------------
      Check failed: (__e == VK_SUCCESS) is false: Vulkan Error, code=-4: VK_ERROR_DEVICE_LOST
    Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.  

It occurs when the following line of code executes:

m_output.CopyToBytes(*data, size);
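
Since the log shows the process terminating on an uncaught exception, one option while debugging is to catch the error at the call site and log its message instead of letting the app abort. The following is a minimal sketch, not TVM API: `runGuarded` is a hypothetical helper, and it assumes that tvm::runtime::InternalError derives from std::exception, which should be checked against the TVM version in use.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of TVM): runs `step` and, if it throws,
// stores the exception message in `error` instead of letting it escape.
// Assumption: the TVM error type derives from std::exception.
bool runGuarded(const std::function<void()>& step, std::string* error) {
    try {
        step();
        return true;
    } catch (const std::exception& e) {
        if (error != nullptr) {
            *error = e.what();
        }
        return false;
    }
}
```

It could be used as `runGuarded([&] { m_output.CopyToBytes(*data, size); }, &msg)` followed by logging `msg`. Catching the exception does not recover a lost Vulkan device; it only keeps the process alive long enough to report the failure.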

All my code is in the following sample:

bool TVMhandler::initModel(AAssetManager * assetManager, const char* modelName) {

    Device dev {kDLVulkan, device_id};
    LOGE("before module");   
    m_mod = tvm::runtime::Module::LoadFromFile(modelName);
    LOGE("After module");
    Module def = m_mod.GetFunction("default")(dev);
    m_inputFunc = def.GetFunction("set_input");
    m_runFunc = def.GetFunction("run");
    m_outputFunc = def.GetFunction("get_output");

    m_input = NDArray::Empty({1, 3, 256, 256}, DLDataType{kDLFloat, 32, 1}, dev);
    m_output = NDArray::Empty({1, 209, 64, 64}, DLDataType{kDLFloat, 32, 1}, dev);

    LOGE("End of the init");
    return true;
}

bool TVMhandler::track(cv::Mat &input, float **data, int &size, int **shape) {

    cv::Mat blob = cv::dnn::blobFromImage(input, 1.0 / 255.0);

    m_input.CopyFromBytes(blob.data, blob.total()*blob.elemSize());

    m_inputFunc("input", m_input);
    LOGE("Before run");
    m_runFunc();
    LOGE("AFTER run");
    m_outputFunc(1, m_output);
    LOGE("After output");
    *shape = new int[4];

    (*shape)[0] = m_output->shape[0];
    (*shape)[1] =  m_output->shape[1];
    (*shape)[2] = m_output->shape[2];
    (*shape)[3] = m_output->shape[3];

    size = m_output->shape[0] * m_output->shape[1] * m_output->shape[2] * m_output->shape[3] * sizeof(float);
    *data = new float[size];
    LOGE("copy data");
    m_output.CopyToBytes(*data, size);
    LOGE("end of the queu");
    return true;
}

I would like to note that the above code works successfully when tvm_runtime and the model are built for CPU (of course, with kDLCPU instead of kDLVulkan).

Could you please suggest why it doesn't work? I would be very grateful.

I think it should be m_output.CopyToBytes(*data, sizeof(float) * size);

Please look at the code: I already multiply by sizeof(float).

    size = m_output->shape[0] * m_output->shape[1] * m_output->shape[2] * m_output->shape[3] * sizeof(float);
    *data = new float[size];
    LOGE("copy data");
    m_output.CopyToBytes(*data, size);
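
To make the element-count versus byte-count distinction in this exchange explicit, here is a small self-contained sketch. The names `numElements`, `OutputBuffer` and `makeOutputBuffer` are illustrative and not part of TVM. In the posted code, `size` already holds a byte count, so `new float[size]` allocates four times more floats than needed; that is wasteful but does not overflow the buffer, so it is unlikely to explain the crash on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Number of elements described by a tensor shape.
std::size_t numElements(const std::vector<std::int64_t>& shape) {
    std::size_t n = 1;
    for (std::int64_t d : shape) {
        n *= static_cast<std::size_t>(d);
    }
    return n;
}

// Host buffer that keeps the element count and the byte count separate.
struct OutputBuffer {
    std::unique_ptr<float[]> data;
    std::size_t elements = 0;  // use this for allocation
    std::size_t bytes = 0;     // use this for byte-oriented copies
};

OutputBuffer makeOutputBuffer(const std::vector<std::int64_t>& shape) {
    OutputBuffer buf;
    buf.elements = numElements(shape);
    buf.bytes = buf.elements * sizeof(float);
    buf.data = std::make_unique<float[]>(buf.elements);
    return buf;
}
```

With the shape from the post, `makeOutputBuffer({1, 209, 64, 64})` yields 856064 elements, and `buf.bytes` is the value that would be passed to a byte-oriented copy such as CopyToBytes.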

Additional information: I looked into vulkan_stream.cc, and the place where the error occurred is

VULKAN_CALL(vkQueueSubmit(vctx_->queue, 1, &cb_submit, state_->fence_));

Maybe I need to set additional parameters that are not necessary for the CPU version?
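
For reading logs like the one above, it can help to translate the numeric code back into its VkResult name. The sketch below is a standalone lookup with values copied from the core VkResult enum in the Vulkan specification; `vkResultName` is a hypothetical helper and covers only a subset of codes. In real code, including the Vulkan headers and using the enum directly is preferable.

```cpp
#include <string>

// Hypothetical helper: maps a numeric VkResult code to its name.
// Only a subset of the core Vulkan 1.0 result codes is listed here.
std::string vkResultName(int code) {
    switch (code) {
        case 0:   return "VK_SUCCESS";
        case 1:   return "VK_NOT_READY";
        case 2:   return "VK_TIMEOUT";
        case -1:  return "VK_ERROR_OUT_OF_HOST_MEMORY";
        case -2:  return "VK_ERROR_OUT_OF_DEVICE_MEMORY";
        case -3:  return "VK_ERROR_INITIALIZATION_FAILED";
        case -4:  return "VK_ERROR_DEVICE_LOST";
        case -5:  return "VK_ERROR_MEMORY_MAP_FAILED";
        case -6:  return "VK_ERROR_LAYER_NOT_PRESENT";
        case -7:  return "VK_ERROR_EXTENSION_NOT_PRESENT";
        case -8:  return "VK_ERROR_FEATURE_NOT_PRESENT";
        case -9:  return "VK_ERROR_INCOMPATIBLE_DRIVER";
        default:  return "unknown VkResult (" + std::to_string(code) + ")";
    }
}
```

The Vulkan specification describes a lost device as a possible result of hardware errors, execution timeouts and similar events, so a VK_ERROR_DEVICE_LOST returned from vkQueueSubmit may point at the submitted kernels or the driver rather than at a missing parameter on the host side.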

Hello @masahi, could you please tell me how to set TVM_VULKAN_ENABLE_VALIDATION_LAYERS to enable the Vulkan validation layers for debugging the problem above?

Hi @WildTaras, this is a bit late, and it looks like you may already have a working configuration, but I'm replying nonetheless. Here are the relevant lines from my config.cmake for cross-compiling libtvm_runtime.so to Android with OpenCL support.

set(USE_OPENCL /home/csullivan/octoml/android_libs)
set(USE_CPP_RPC ON)
set(CMAKE_TOOLCHAIN_FILE $ENV{ANDROID_NDK}/build/cmake/android.toolchain.cmake)
set(ANDROID_ABI "arm64-v8a")
set(ANDROID_PLATFORM android-28)
message(STATUS "Using toolchain file: ${CMAKE_TOOLCHAIN_FILE}.")

You mentioned above some confusion about which libvulkan.so to use when linking the cross-compiled binary. I suspect you’ll want to use adb to extract the libvulkan.so from your device, as described here. I'm not sure, but it is possible that the VK_ERROR_DEVICE_LOST error you see results from differences between the Vulkan library you linked against at compile time and the one on the device.


I think you can use export TVM_VULKAN_ENABLE_VALIDATION_LAYERS=1.
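
Inside an Android app there is usually no shell in which to run `export`, so one possible alternative is to set the variable from native code before the Vulkan device is first created. This is a minimal sketch under assumptions: the helper names are hypothetical, and it assumes the TVM runtime reads the variable from the process environment when it initializes Vulkan. The validation layer library also has to be available to the app for this to have any effect.

```cpp
#include <cstdlib>
#include <cstring>
#include <stdlib.h>  // setenv (POSIX, available on Android)

// Hypothetical helper: sets the variable mentioned in this thread for the
// current process. Call it before the first TVM call that touches Vulkan.
bool enableTvmVulkanValidation() {
    return setenv("TVM_VULKAN_ENABLE_VALIDATION_LAYERS", "1", /*overwrite=*/1) == 0;
}

// Hypothetical helper: reports whether the variable is currently set to "1".
bool isTvmVulkanValidationRequested() {
    const char* value = std::getenv("TVM_VULKAN_ENABLE_VALIDATION_LAYERS");
    return value != nullptr && std::strcmp(value, "1") == 0;
}
```

In the code from this thread, `enableTvmVulkanValidation()` would go at the top of `TVMhandler::initModel`, before `LoadFromFile`.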

@csullivan Thank you for your response. But when I took that libvulkan.so and compiled tvm_runtime against it, the result was the same: it gives me the VK_ERROR_DEVICE_LOST error. I should mention that I also ran tvm_runtime with the OpenCL backend to check the way I use the TVM library. It works! tvm_runtime with the OpenCL library works well. Do you have any additional suggestions?

Hi @WildTaras, can you please share the details of building TVM with Vulkan on Android? I am getting SPIRV-related errors. Can you also share the TVM config file used for the build?

Can you please point me to the process for building TVM with Vulkan for Android?

I am facing multiple redefinition errors as well.