Error while initializing PAPI (workaround documentation)

I ran into an issue running PAPI in TVM. Whenever I tried to initialize it, I got the error:

File "/home/perry/phd/tvm-power-profile/tvm/src/runtime/contrib/papi/papi.cc", line 131
TVMError: Check failed: PAPI_library_init(((((7)<<24) | ((0)<<16) | ((0)<<8) | (0)) & 0xffff0000)) == ((((7)<<24) | ((0)<<16) | ((0)<<8) | (0)) & 0xffff0000) (-1 vs. 117440512) : Error while initializing PAPI

However, I have v6.0 of PAPI installed.

I verified this by compiling and running this C++ program:

// Import and initialize the PAPI library
// g++ -std=c++11 -o papi_test papi_test.cpp -lpapi
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
#include <papi.h>
#include <string>
#include <vector>

using namespace std;

#define NUM_EVENTS 2
#define NUM_RUNS 10

int main(int argc, char *argv[]) {
  // Initialize the PAPI library
  int retval = PAPI_library_init(PAPI_VER_CURRENT);
  if (retval != PAPI_VER_CURRENT) {
    cout << "PAPI library init error!" << endl;
    exit(1);
  }
  cout << PAPI_VER_CURRENT << endl;
}

This confirmed that I was running v6.0, since it printed 100663296 (6 << 24) rather than 117440512 (7 << 24). I tried clearing my TVM build directory and compiling again from scratch. The CMake process confirmed that it had found PAPI v6.0.
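For anyone comparing these numbers: PAPI packs its version into a single integer with the major version in the top byte, and the & 0xffff0000 visible in the error message masks off everything below the minor version. Here is a minimal Python sketch of that decoding. The name decode_papi_version is a hypothetical helper of my own, not part of PAPI or TVM.

```python
def decode_papi_version(value):
    """Split a packed PAPI version integer into (major, minor, revision, increment).

    Hypothetical helper for illustration; it mirrors the shifts shown in the
    macro expansion in the TVM error message.
    """
    return (
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )


expected_by_tvm = 117440512  # value the failed check in papi.cc compared against
printed_by_test = 100663296  # PAPI_VER_CURRENT printed by the C++ test program

print(decode_papi_version(expected_by_tvm))  # (7, 0, 0, 0)
print(decode_papi_version(printed_by_test))  # (6, 0, 0, 0)
```

The value expected by the check inside TVM decodes to 7.0, while the test program saw 6.0. That suggests the TVM build picked up a different papi.h than the test program did, though I have not confirmed where it came from.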

However, the error happened again. My workaround is to edit line 131 of papi.cc in TVM so that v6.0 is hardcoded:

CHECK_EQ(PAPI_library_init(100663296), 100663296)
          << "Error while initializing PAPI";

This could be an issue with a partially broken PAPI install on my machine, but all of the libraries and headers appear to be the right version. I have my workaround, so I’m just documenting this for anyone else searching for the error.

As I understand it, PAPI_VER_CURRENT is defined in papi.h, and there is only one version of that on my system according to $ locate papi.h.
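Since locate relies on a prebuilt index, a direct walk of the filesystem can serve as a cross-check. The sketch below looks for files named papi.h under the directories given on the command line and reports the version each one declares. It assumes the header defines PAPI_VERSION via PAPI_VERSION_NUMBER(...), which is what the macro expansion in the error message suggests. find_papi_headers is a hypothetical helper name of my own.

```python
import os
import re
import sys

# Matches lines like: #define PAPI_VERSION PAPI_VERSION_NUMBER(6,0,0,1)
# (assumed format of the define in papi.h)
VERSION_RE = re.compile(
    r"#\s*define\s+PAPI_VERSION\s+PAPI_VERSION_NUMBER"
    r"\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)"
)


def find_papi_headers(roots):
    """Return {path: (major, minor, revision, increment) or None} for each papi.h found."""
    found = {}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if "papi.h" not in filenames:
                continue
            path = os.path.join(dirpath, "papi.h")
            try:
                with open(path, errors="replace") as header:
                    match = VERSION_RE.search(header.read())
            except OSError:
                continue  # unreadable file: skip it
            # None means a papi.h was found but no version define matched.
            found[path] = tuple(int(g) for g in match.groups()) if match else None
    return found


if __name__ == "__main__":
    # Search roots come from the command line, e.g. /usr /opt (example paths only).
    for path, version in sorted(find_papi_headers(sys.argv[1:]).items()):
        print(path, version)
```

More than one hit, or a hit with an unexpected major version, would be worth comparing against the include path that the TVM build actually used.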

I was ultimately unable to run CUDA counters with this setup.

I then moved to a Docker container with permissive access to hardware counters. For some reason, in this setup, the recommended PAPI version, papi-6-0-0-1-t, failed to build with support for my system, so I tried papi-7-0-1-t instead.

This works, but when I try to profile on the TVM side, I get the error:

  File "/tvm/src/target/opt/build_cuda_on.cc", line 147
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (calling_conv == CallingConv::kDeviceKernelLaunch) is false: CodeGenCUDA: expect calling_conv equals CallingConv::kDeviceKernelLaunch

If I insert a basic print statement at that line of the file, I get the output:

2
2
2
2
2
1

These are the values of calling_conv on each invocation before the crash.
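To make those integers easier to read, here is a small sketch that maps them to names. The numeric values are an assumption based on my recollection of the CallingConv enum in TVM's include/tvm/ir/function.h, so check them against your own checkout before relying on this.

```python
from enum import IntEnum


class CallingConv(IntEnum):
    # Assumed values, mirroring tvm::CallingConv in include/tvm/ir/function.h.
    # Verify against the TVM version you are building.
    kDefault = 0
    kCPackedFunc = 1
    kDeviceKernelLaunch = 2


# Values printed by the debug statement, in order, up to the crash.
printed = [2, 2, 2, 2, 2, 1]
names = [CallingConv(value).name for value in printed]

for value, name in zip(printed, names):
    print(value, name)
```

If those values hold, the first five functions reaching the CUDA code generator are device kernels, and the one that triggers the check is a function with the packed-function calling convention, which I would expect to be handled on the host side instead.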

My profiling code is this:

import tvm
from tvm import relay
from tvm.relay.testing import mlp
from tvm.runtime import profiler_vm
import numpy as np

# target = "llvm"
# target_host = "llvm"
# dev = tvm.cpu()

target = "cuda"
target_host = "llvm"
dev = tvm.cuda(0)

mod, params = mlp.get_workload(1)

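# Note: target_host is passed `target` ("cuda") here, not the target_host variable ("llvm") defined above.
exe = relay.vm.compile(mod, target, target_host=target, params=params)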


data = tvm.nd.array(np.random.rand(1, 1, 28, 28).astype("float32"), device=dev)
vm = profiler_vm.VirtualMachineProfiler(exe, dev)
report = vm.profile(
    data,
    func_name="main",
    collectors=[
        tvm.runtime.profiling.PAPIMetricCollector(
            {dev: ["nvml:::NVIDIA_GeForce_RTX_3090:device_1:power"]}
        )
    ],
)

print(report)

I’m going to try running on a different host. The container let me bypass some of the host’s issues, but since we’re dealing with things close to the kernel, I might still be hitting some of the same problems.