Hi, I am trying to offload the supported part of a MobileNet-V2 model to Arm Compute Library with partition_for_arm_compute_lib(mod) provided by the ACL BYOC pipeline.
However, when I build the partitioned graph with relay.build, the layout_transform ops required by the ACL operators seem to end up inside the sub-graphs annotated with 'arm_compute_lib', and they aren't folded away as expected.
Since the ACL runtime can't recognize the layout_transform op, I get an unsupported-op error when trying to run the lib. Any suggestions for resolving this? Or did I do something wrong?
Hi @mgeek, it looks as though the constants aren't bound to the function you are running and are instead being treated like variable inputs. Constant folding can't work in that case, which is why the layout_transform operator remains. After importing your model from a frontend, but before partitioning for ACL, I would suggest running the bind_params_by_name pass. There is a short example that might help in the thread "Constant params should be constants" (post #3 by lhutton1). After running this you should see the params appear as Constant[] when printing the Relay module.
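A minimal sketch of that ordering, in case it helps. It assumes mod and params are what a Relay frontend importer returned; the helper names bind_constants and bind_then_partition are hypothetical and only there for illustration.

```python
def bind_constants(mod, params):
    """Bind the weights in `params` into mod["main"] so they become Relay constants."""
    # Imported inside the helper so this file can be loaded where TVM is not installed.
    from tvm.relay.build_module import bind_params_by_name

    mod["main"] = bind_params_by_name(mod["main"], params)
    return mod


def bind_then_partition(mod, params):
    """Bind params first, then partition for ACL, so constant folding can see the weights."""
    from tvm.relay.op.contrib.arm_compute_lib import partition_for_arm_compute_lib

    mod = bind_constants(mod, params)
    return partition_for_arm_compute_lib(mod)
```

Calling bind_then_partition(mod, params) and printing the result should show the weights as constants rather than as extra function parameters.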
I would also recommend taking a look at the user-facing TVMC Python interface, which should automatically take care of these types of issues.
Btw, is there any multi-thread support for using ACL in TVM via BYOC? I tried setting the number of threads via os.environ["TVM_NUM_THREADS"]. It is effective for TVM-defined operators, but it makes no difference for the offloaded ACL operators.
Hi, yes, you're correct that TVM_NUM_THREADS is only effective for TVM-defined operators. It's been a while since I last looked at the ACL integration, but I believe multi-threading for the offloaded ACL operators simply hasn't been explored in much detail yet.
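For the TVM-defined operators only, here is a small sketch of setting the variable from Python. The helper name set_tvm_num_threads is hypothetical, and I am assuming the value needs to be in the environment before the first TVM module runs; as discussed above, it should not change how the offloaded ACL operators execute.

```python
import os


def set_tvm_num_threads(n: int) -> None:
    """Set TVM_NUM_THREADS for TVM's own operators (hypothetical helper)."""
    if n < 1:
        raise ValueError("thread count must be a positive integer")
    # Environment values must be strings. Assumption: set this before running
    # any TVM module so that TVM's thread pool sees it.
    os.environ["TVM_NUM_THREADS"] = str(n)


set_tvm_num_threads(4)
```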