At work I only had limited access to capable Linux boxes with a GPU, but I was really impressed reading about this project and the benchmarks. I wanted to kick the TVM tires on my 16-core box with an NVIDIA 1080 (pretty slow compared to what all these folks use), but it runs Windows and autotvm isn’t supported there.
I put in the time and have enough implemented to auto-tune the models we have on my Windows box. I hope it will be of use to someone else in a similar position, and possibly serve in the future as a proof of concept for someone more knowledgeable to build official support on. I do question even the importance of this, as I don’t know how many Windows users are interested, and it could all be moot if Windows eventually adds GPU support in WSL (it won’t be in WSL 2.0, but they have signaled they are interested).
I have uploaded a hodge-podge guide on Google Docs here. It is in no way a bulletproof, step-by-step guide, but if you are familiar with installing TVM from source on Linux, it should hopefully be enough to get started on Windows. At the very least it’s a reference for myself that I am sharing.
Great job! Could you describe USE_OPENMP in more detail? What problem did you hit when using the TVM thread pool?
I am thinking about whether we could have a section named “Resources” that could contain community members’ docs like this one, such as @mli’s Dive into Deep Learning Compiler. @tqchen
It’s been a few weeks, but IIRC a Python threading.Thread.start() call in check_remote(...) (measure_methods.py) was deadlocking. The thread target would never run, and the call would never return from thread.start().
I remember using Process Explorer to inspect the thread stacks and found it was a Python thread blocked deep inside the CPython runtime. Setting TVM_NUM_THREADS=1 fixed it; when building with OpenMP, I didn’t need to set it at all.
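For anyone hitting the same deadlock, a minimal sketch of the workaround described above: restrict TVM's runtime thread pool to one thread via the TVM_NUM_THREADS environment variable before TVM is imported. (This is the env-var workaround from the post, not an officially documented fix; building with USE_OPENMP reportedly makes it unnecessary.)

```python
import os

# Workaround sketch: force TVM's native thread pool down to a single thread.
# The variable must be set before the TVM runtime initializes, so do it
# before `import tvm` anywhere in the process.
os.environ["TVM_NUM_THREADS"] = "1"

# import tvm          # safe to import and start autotvm tuning after this
```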
Got it, I didn’t realize you had already fixed this. Is there any concern about committing this back to the main repo, or fixing the root cause (the Oct 2019 commit)?
That code is part of autotvm, which of course isn’t officially supported on Windows, so I didn’t want to bug the reviewers with a PR to fix something that generally doesn’t work there anyway.
Would you mind sending out a PR with your changes to make it work on Windows?
I’m happy to review it; I make an effort to mention Windows support in all of the changes I review. I think we should formalize the effort to make AutoTVM work on Windows.
I think it would be great to formalize Windows support, and I would be happy to work on that, but my gut feeling is not optimistic about buy-in from the project owners…and I’m probably too shy to ask.
One reason is that the maintenance/testing burden of supporting AutoTVM on Windows increases quite a bit. In my branch, I’ve been careful to preserve behavior on POSIX platforms, but I had to add a lot of `if os.name == 'nt'` checks. Future changes the project owners may want to make could be encumbered by having to support Windows.
I had to do a lot of little hacks to make Windows run close to Linux speed. Most of it is because there is no fork(), threading in Python is poor, and multiprocessing.Process is very slow to spawn…so I had to cache processes and process pools.
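A minimal sketch of the caching idea described above: since Windows lacks fork(), multiprocessing spawns a fresh interpreter per worker, so creating a new pool for every measurement round is expensive. Keeping one long-lived pool amortizes that spawn cost. The module-level `_POOL` cache and `get_pool()` helper here are illustrative names, not part of TVM's API.

```python
import multiprocessing

# Cache a single process pool for the lifetime of the process instead of
# re-creating one per batch of jobs. On Windows (spawn start method) each
# worker pays full interpreter startup, so reuse matters much more than
# on Linux (fork).
_POOL = None

def get_pool(num_workers=None):
    """Return the cached process pool, creating it on first use."""
    global _POOL
    if _POOL is None:
        _POOL = multiprocessing.Pool(num_workers)
    return _POOL

def square(x):
    # Toy task standing in for a measurement/build job.
    return x * x

if __name__ == "__main__":
    pool = get_pool(2)
    print(pool.map(square, [1, 2, 3]))
    # Later calls reuse the same workers rather than spawning new ones.
    assert get_pool() is pool
```

The same shape works for caching individual worker processes: create once behind a platform check (`os.name == 'nt'`), hand out the cached handle thereafter.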
I recently added Windows support to the C++ RPC server, which was a big perf win and less hacky compared to what I had to do with the Python RPC server.
I think the more code that can be pushed out of Python and into C++, the better the chance of a good Windows implementation…especially in local_executor and xgboost_cost_model.
How about this: would you mind sending a PR off of your fork? On the PR we can have a more detailed discussion of the specific code design. I really think it would be worthwhile to start committing these changes back, and I think the other reviewers/committers would be happy to see improved Windows support.
Too many errors, and since the code does not print the error properly, I formatted it a little:
```
Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\contrib\cc.py", line 185, in _windows_shared
    link_cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
  File "C:\Program Files\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Program Files\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\_ffi\_ctypes\function.py", line 72, in cfun
    rv = local_pyfunc(*pyargs)
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\rpc\server.py", line 84, in load_module
    m = _load_module(path)
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\module.py", line 266, in load
    _cc.create_shared(path + ".so", files)
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\contrib\cc.py", lin
    raise InstantiationError("Skipped because of invalid gpu kernel")\ntvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel'))

Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\_ffi\_ctypes\function.py", line 72, in cfun
    rv = local_pyfunc(*pyargs)
  File "C:\Program Files\Python37\lib\site-packages\tvm-0.6.0-py3.7-win-amd64.egg\tvm\autotvm\measure\measure_methods.py", line 621, in verify_pass
    raise InstantiationError("Skipped because of invalid gpu kernel")
tvm.autotvm.task.space.InstantiationError: Skipped because of invalid gpu kernel
```