Memory Leak in Debug Executor


I am trying to benchmark a large number of individual layer workloads using the debug runtime. To get more accurate results, I execute the run function in a loop, up to 50 times.

However, I noticed that every iteration causes additional memory to be allocated. This memory is never freed, neither when destroying the module nor when recompiling it; only restarting the Python kernel frees it.

The same leak does not occur when I use the profile function with the PAPI backend.

Minimal example:

from tvm.contrib.debugger import debug_executor as graph_runtime
import os
import psutil

mod =
params = {}
with tvm.transform.PassContext(opt_level=3):
    compiled_graph_lib =, target_class, params=params)
## building runtime
debug_g_mod = graph_runtime.GraphModuleDebug(
    compiled_graph_lib["debug_create"]("default", dev),

for r in range(0, runs):
    # run debug runtime with time measurements only
    print(r+1, psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2)

Should I just use profile instead, or will the time measurement be less precise with that function? TVM version used: 0.8.dev0

Hey Max, can you open a GitHub issue? Also, I can't seem to run your minimal example.

One thing to try, if you have time, is a dev install of TVM built from source, to see whether this is still an issue.
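To make the comparison between the two installs easier, something like the helper below might be useful. It is only a sketch: `rss_mib` and `rss_growth` are hypothetical names, not part of TVM, and it reads the resident set size with psutil the same way the example in the question does. It forces a garbage collection after each call, so that ordinary collectable Python objects are not mistaken for a leak.

```python
import gc
import os


def rss_mib():
    """Resident set size of this process in MiB, read via psutil as in the question."""
    import psutil  # imported lazily so a different reader can be passed in instead
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2


def rss_growth(fn, runs=50, read_rss=rss_mib):
    """Call fn() `runs` times; return the RSS change in MiB after each call, relative to the start.

    Hypothetical helper for illustration only, not a TVM API.
    """
    gc.collect()
    baseline = read_rss()
    growth = []
    for _ in range(runs):
        fn()
        gc.collect()  # drop collectable Python garbage before sampling
        growth.append(read_rss() - baseline)
    return growth
```

You would pass in whatever call sits inside your loop, for example the run function of the debug module. If the numbers keep climbing roughly linearly with the iteration count on both installs, that would be good evidence to attach to the GitHub issue.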