SGX timing benchmark is too slow

Hi all, I am trying to benchmark TVM running inside an SGX enclave. I am using the built-in 'tvm.relay.testing' models (e.g. mobilenet or squeezenet) and running the SGX example. The output tensor is correct (shape and values), but inference is too slow. When I run these models in standard TVM execution (Python) I get a latency of a couple of milliseconds, but inside the enclave I get close to one second :frowning: . I measured the time immediately before and after the exec.run() call in the main.rs file. Moreover, when I run inference repeatedly, the latency grows with each iteration. I also noticed that inside the SGX enclave only one CPU is busy, while in regular TVM (Python) all my CPUs are busy, which seems odd.
I tried changing the stack/heap/threads configuration in the Cargo.toml file, with no luck. I also tried autotuned models, but the latency was still far from the Python numbers.
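
For anyone trying to reproduce the numbers, here is a minimal sketch of the kind of measurement described above: a few untimed warm-up calls, then one latency sample per call so that growth across iterations is visible. The helper names time_iterations and mean are hypothetical and not part of TVM, and the sketch assumes std::time::Instant is usable in the build; inside an enclave the clock may have to come from the untrusted host instead.

```rust
use std::time::{Duration, Instant};

/// Calls `f` `warmup` times without timing, then `iters` times with timing.
/// Returns one latency sample per timed call, in call order.
/// (Hypothetical helper, not a TVM API.)
fn time_iterations<F: FnMut()>(mut f: F, warmup: usize, iters: usize) -> Vec<Duration> {
    // Warm-up calls absorb one-time costs such as lazy allocation.
    for _ in 0..warmup {
        f();
    }
    let mut samples = Vec::with_capacity(iters);
    for _ in 0..iters {
        let start = Instant::now();
        f();
        samples.push(start.elapsed());
    }
    samples
}

/// Arithmetic mean of the samples; zero duration for an empty slice.
fn mean(samples: &[Duration]) -> Duration {
    if samples.is_empty() {
        return Duration::default();
    }
    samples.iter().sum::<Duration>() / samples.len() as u32
}
```

In main.rs the closure would wrap the call being measured, for example `time_iterations(|| { exec.run(); }, 3, 20)`, where `exec` is assumed to be the executor from the SGX example. Printing the samples in order, rather than only their mean, shows whether the latency really increases from one iteration to the next.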

Has anyone tried this before? Any suggestions would be appreciated :smile:

Elad