Hi all,
I am trying to benchmark TVM running inside an SGX enclave.
I am using the built-in `tvm.relay.testing` models (e.g. mobilenet and squeezenet) and running the SGX example. The output tensor is correct (shape and values), but inference is far too slow: with standard TVM execution in Python I get a latency of a couple of milliseconds, while inside the enclave it is close to one second.
I measured the time immediately before and after the `exec.run()` call in the main.rs file.
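For reference, this is roughly the pattern I use for the measurement (a simplified sketch; the executor setup is omitted, and `exec.run()` is the call from the SGX example's main.rs):

```rust
use std::time::{Duration, Instant};

// Small helper wrapped around the inference call in main.rs.
// `f` is just a closure, e.g. || exec.run().
fn time_it<F: FnMut()>(mut f: F) -> Duration {
    let start = Instant::now();
    f();
    start.elapsed()
}

// In main.rs (sketch):
//     let latency = time_it(|| exec.run());
//     println!("exec.run() latency: {:?}", latency);
```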
Moreover, when I run inference repeatedly, the latency grows with each iteration. I also noticed that inside the SGX enclave only one CPU is busy, while with regular TVM (Python) all my CPUs were busy, which seems odd.
I tried playing with the stack/heap/threads configuration in the Cargo.toml file, with no luck.
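For reference, this is the kind of thing I experimented with (the numbers are just examples, and I am assuming the Fortanix EDP-style `[package.metadata.fortanix-sgx]` section that the example's Cargo.toml uses):

```toml
[package.metadata.fortanix-sgx]
heap-size  = 0x8000000   # 128 MiB enclave heap
stack-size = 0x200000    # 2 MiB stack per thread
threads    = 8           # extra TCSs, hoping the TVM thread pool could use more cores
debug      = false
```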
I also tried auto-tuned models, but the performance was still far off.
Has anyone tried this before? Any suggestions would be appreciated :-)
Elad