Additional info: in a multi-process run, the performance of module.run() in each process does not drop.