--------------------------------------- Update on the issue ---------------------------------------
I had an older build of TVM on my system, and the paths were set to that build. However, when the build was run through the async server, it would clone the latest version, and the paths for that terminal would then point to the latest build. In the latest build, _serve_loop in server.py is a local (nested) function and therefore cannot be pickled; this is not the case in the older build. I request the developers to look into it.
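For illustration, here is a minimal sketch (not TVM code; serving_like and the nested _serve_loop are placeholders) of why a locally defined function cannot be used as a multiprocessing target once the spawn start method is in effect: spawn has to pickle the Process object, and pickle cannot serialize local objects.

```python
import multiprocessing as mp


def serving_like():
    # Placeholder for a function that, like the newer _serving() in
    # tvm/rpc/server.py, defines its serve loop as a nested function.
    def _serve_loop():
        print("serving")

    proc = mp.Process(target=_serve_loop)
    proc.start()  # under "spawn", the Process (and its target) must be picklable
    proc.join()


if __name__ == "__main__":
    # The traceback further down goes through popen_spawn_posix, i.e. spawn.
    mp.set_start_method("spawn")
    serving_like()  # AttributeError: Can't pickle local object 'serving_like.<locals>._serve_loop'
```

Running this reproduces the same class of error as the AttributeError shown in the traceback at the end of the post.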
Some context: I am working on a personal project that involves an async server (built with the asyncio library) that listens for client requests. A client can request autotuning from the server, and the server executes the corresponding commands. Currently I am testing everything on localhost, with the target device (the client) connected via a LAN cable.
After successfully building TVM on the client device, setting up the TVM Python paths, and completing the other necessary steps, the commands are executed from the asyncio server script in the given order:
- Async Server Side:
python3 -m tvm.exec.rpc_tracker --host=192.168.55.2 --port=9101
- Client Side:
python3 -m tvm.exec.rpc_server --tracker=192.168.55.2:9101 --key=abc --no-fork
- Async Server Side:
python3 RPC_autotune.py --key=abc --port=9101
I would like to point out that these commands, in the given order, work perfectly as intended when run manually from the command line. They also work as intended when I launch them manually using subprocess.Popen.
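For reference, here is a simplified, hypothetical sketch of the two launch mechanisms I use inside the async server (the command string, function names, and error handling are illustrative only; the real script differs):

```python
import asyncio
import subprocess

# Illustrative command; the real script launches the tracker/server/tuning
# commands shown above.
CMD = "python3 RPC_autotune.py --key=abc --port=9101"


async def launch_with_create_subprocess_shell():
    # Variant 1: asyncio-native subprocess
    proc = await asyncio.create_subprocess_shell(
        CMD,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    return proc.returncode, stdout, stderr


def _blocking_popen():
    # Variant 2: plain subprocess.Popen, run on a worker thread
    proc = subprocess.Popen(CMD, shell=True)
    return proc.wait()


async def launch_with_to_thread():
    # asyncio.to_thread requires Python 3.9+
    return await asyncio.to_thread(_blocking_popen)


if __name__ == "__main__":
    asyncio.run(launch_with_create_subprocess_shell())
```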
However, when the commands are executed from within the async server script using asyncio.to_thread with subprocess.Popen, or using asyncio.create_subprocess_shell, the server does connect to the tracker initially, as follows:
abc@xyz:~$ python3 -m tvm.exec.query_rpc_tracker --host 192.168.55.2 --port 9101
Tracker address 192.168.55.2:9101
Server List
-------------------------------------
server-address key
-------------------------------------
192.168.55.1:9090 abc
-------------------------------------
Queue Status
-------------------------------------------
key total free pending
-------------------------------------------
abc 1 1 0
-------------------------------------------
But when autotuning is about to start, the client device disappears from the Server List and the request goes into a pending state, as follows:
abc@xyz:~$ python3 -m tvm.exec.query_rpc_tracker --host 192.168.55.2 --port 9101
Tracker address 192.168.55.2:9101
Server List
-------------------------------------
server-address key
-------------------------------------
-------------------------------------
Queue Status
-------------------------------------------
key total free pending
-------------------------------------------
abc 0 0 2
-------------------------------------------
Upon further logging on the client device, the following error was reported:
nvidia@ubuntu:~$ cat log_file.log
2023-05-23 16:12:14.665 INFO bind to 0.0.0.0:9090
2023-05-23 16:12:42.234 INFO connected from ('192.168.55.2', 38360)
2023-05-23 16:12:42.251 INFO start serving at /tmp/tmpy23kstl4
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/nvidia/tvm_python/tvm/rpc/server.py", line 272, in _listen_loop
_serving(conn, addr, opts, load_library)
File "/home/nvidia/tvm_python/tvm/rpc/server.py", line 143, in _serving
server_proc.start()
File "/usr/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/usr/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/usr/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/usr/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/usr/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/usr/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/usr/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object '_serving.<locals>._serve_loop'
This issue occurs only when the commands are executed from within the async server, not when they are run from the command line. It would be helpful if someone could suggest a workaround or a way to fix this error. Thanks.