iOS TVM model gives the same output irrespective of input

Hi TVM Team,

@tqchen @preethireddy @shivmgg @echuraev I am developing an iOS TVM application, but I am getting the same inference output from the iOS TVM model irrespective of the input.

  1. Converted the TensorFlow model to a TVM model - success
  2. Ran inference with the converted x86 TVM model in Python - success
  3. Converted the TF model again to a TVM model for iOS arm64 - success
  4. Ran inference with the converted arm64 model on iOS - failure: irrespective of the input, it gives the same output rather than the expected inference result.

Below are the target and target_host:

target = "metal"
target_host = "llvm -mtriple=arm64-apple-darwin"

Note: even if I set all input values to zero, it gives the same inference output.

What could be the issue? Kindly help me fix it.

Thank you.

How do you export the model? Do you specify the iPhone SDK, like:

from tvm.contrib import xcode
arch = "arm64"
sdk = "iphoneos"
libpath = model_folder_path + "/" + model_name + ".dylib"
lib.export_library(libpath, xcode.create_dylib, arch=arch, sdk=sdk)

"Tried to run inference with the converted arm64 model on iOS - the model gives the same output irrespective of the input."

Do you do this through RPC, or do you build TVM and the model into your application? If you run it natively in your own iOS app, how do you feed the data? Could you share that part of the code?

Hi @elvin-n, I have built a standalone iOS application.

For iOS CPU:

  • sdk = "iphoneos"
  • target = "llvm -mtriple=arm64-apple-darwin"

For iOS Metal:

  • sdk = "iphoneos"
  • target = "metal"
  • target_host = "llvm -mtriple=arm64-apple-darwin"

And I used C++ to run inference on the TVM model:

// load module
tvm::runtime::Module mod_syslib = tvm::runtime::Module::LoadFromFile(std_resourcePath);

//load graph
std::ifstream json_in(std_jsonPath);
std::string json_data((std::istreambuf_iterator<char>(json_in)), std::istreambuf_iterator<char>());
json_in.close();

// get the global function that creates the graph executor
auto tvm_graph_runtime_create = tvm::runtime::Registry::Get("tvm.graph_executor.create");
tvm::runtime::Module mod = (*tvm_graph_runtime_create)(json_data, mod_syslib, device_type, device_id);
this->m_handle = new tvm::runtime::Module(mod);

// parameters need to be a TVMByteArray type to indicate the binary data
std::ifstream params_in(std_paramPath, std::ios::binary);
std::string params_data((std::istreambuf_iterator<char>(params_in)), std::istreambuf_iterator<char>());
params_in.close();
TVMByteArray params;
params.data = params_data.c_str();
params.size = params_data.length();
mod.GetFunction("load_params")(params);

tvm::runtime::PackedFunc set_input = mod.GetFunction("set_input");
set_input("input", m_gpuInput);

tvm::runtime::PackedFunc run = mod.GetFunction("run");
run();
tvm::runtime::PackedFunc get_output = mod.GetFunction("get_output");

Which context do you use for creating m_gpuInput, and how do you populate the data?

BTW, you are loading params and json separately, which means you are storing the model as separate artifacts. Is that more convenient for you? Does it find the artifacts during model loading? I prefer to have everything packed into one library created by lib.export_library.
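A minimal sketch of the loading side when everything is packed into one library (this assumes the library was exported from a relay.build graph-executor factory, whose entry point is "default"; "model.dylib" is a placeholder path):

#include <tvm/runtime/module.h>
#include <tvm/runtime/packed_func.h>

// load the single exported library (graph and params are embedded)
tvm::runtime::Module lib = tvm::runtime::Module::LoadFromFile("model.dylib");
// create the graph executor on the target device via the "default" entry point
DLDevice dev{kDLMetal, 0};
tvm::runtime::Module gmod = lib.GetFunction("default")(dev);
tvm::runtime::PackedFunc set_input = gmod.GetFunction("set_input");
tvm::runtime::PackedFunc run = gmod.GetFunction("run");
tvm::runtime::PackedFunc get_output = gmod.GetFunction("get_output");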

m_gpuInput is in the Metal context (kDLMetal),

and I bundled the JSON, params, and dylib separately with the application; I feel it is easier to track down any issue that way.

How do you populate data into this NDArray?

//data is float* 
TVMArrayCopyFromBytes(m_cpuInput, data, w*h*c);
TVMArrayCopyFromTo(m_cpuInput, m_gpuInput, nullptr);
set_input("INPUT", m_gpuInput);

get_output(0, m_gpuOutput0);
TVMArrayCopyFromTo(m_gpuOutput0, m_cpuOutput0, nullptr);

The *4 is missing - is that a copy-paste problem, or did you forget to account for the size of a float? TVMArrayCopyFromBytes deals with bytes, not floats.
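For example, a sketch of the corrected call, assuming float32 data of w*h*c elements:

// TVMArrayCopyFromBytes expects a size in bytes, so scale by sizeof(float)
TVMArrayCopyFromBytes(m_cpuInput, data, w * h * c * sizeof(float));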

Another question: why do you need m_gpuInput at all? You can use just an NDArray allocated in the CPU context; the copy will happen automatically inside the TVM runtime.

  • Yes, it is w*h*c*4.
  • And my understanding is that we can't directly access GPU memory, so I copied all my input data into CPU memory and then copied it to GPU memory. Please excuse me if my understanding is wrong.

In MetalWorkspace::CopyDataFromTo, three situations are handled:

  1. Copy from Metal to Metal
  2. Copy from CPU to Metal
  3. Copy from Metal to CPU

I.e., in your case one extra copy is done, and for now it would be much easier to just pass an NDArray in the CPU context. Could you pass m_cpuInput to set_input("INPUT", m_cpuInput); and verify the result?
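A minimal sketch of that suggestion (the shape {1, h, w, c} and the input name "INPUT" follow the snippets above and are assumptions to adjust for your model):

// allocate the input once in the CPU context; the graph executor itself
// copies the host data to the Metal device when set_input is called
tvm::runtime::NDArray cpu_input = tvm::runtime::NDArray::Empty(
    {1, h, w, c}, DLDataType{kDLFloat, 32, 1}, DLDevice{kDLCPU, 0});
cpu_input.CopyFromBytes(data, w * h * c * sizeof(float));
set_input("INPUT", cpu_input);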

Another difference: I looked at my code and realized that I did not use TVMArrayCopyFromTo, but rather the member functions of NDArray, like:

tvm::runtime::NDArray output = getOutput_(0);
output.CopyTo(y_);

where output is an NDArray in the Metal context and y_ is an NDArray allocated in the CPU context.
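Put together, a sketch of that output path (host_data is a hypothetical name for the host-side result pointer):

// get_output returns an NDArray in the Metal context
tvm::runtime::NDArray output = get_output(0);
// allocate a matching array in the CPU context and copy the device result to the host
tvm::runtime::NDArray y_ = tvm::runtime::NDArray::Empty(
    output.Shape(), output.DataType(), DLDevice{kDLCPU, 0});
output.CopyTo(y_);
const float* host_data = static_cast<const float*>(y_->data);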

@myproject24, did you have a chance to verify whether passing a CPU NDArray as the input and using the other API for copying data solves the problem?

Sorry, I didn't get time to check. I will let you know once I verify it.