Thanks a lot @comaniac! I read the RPC-related issues and then located my bug.
Previously, I tried to parallelize my search by running multiple scripts:
```shell
CUDA_VISIBLE_DEVICES=0 python3 my_tvm_searching.py ...
CUDA_VISIBLE_DEVICES=1 python3 my_tvm_searching.py ...
...
```
Each `my_tvm_searching.py` launched its own local RPC session, which is not the correct usage:
```python
measure_ctx = auto_scheduler.LocalRPCMeasureContext(min_repeat_ms=300)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=100,
    runner=measure_ctx.runner,
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
    builder=tvm.auto_scheduler.LocalBuilder(timeout=100),
    verbose=2,
)
task.tune(tune_option)
del measure_ctx
```
After fixing the bug, my commands are as follows:
```shell
nohup python3 -m tvm.exec.rpc_tracker --host=0.0.0.0 --port=9190 & \
CUDA_VISIBLE_DEVICES=0 nohup python3 -m tvm.exec.rpc_server --tracker 127.0.0.1:9190 --key V100 --host 0.0.0.0 --port=9091 & \
CUDA_VISIBLE_DEVICES=1 nohup python3 -m tvm.exec.rpc_server --tracker 127.0.0.1:9190 --key V100 --host 0.0.0.0 --port=9092 &
...
python3 my_tvm_searching.py
```
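The per-GPU server commands follow a simple pattern (one `CUDA_VISIBLE_DEVICES` index and one port per GPU, all registering under the same key). As a sketch, a small helper like this (hypothetical, not part of TVM) can generate the launch line for each GPU:

```python
# Hypothetical helper (not part of TVM): build one rpc_server launch command
# per GPU, each pinned to its own device via CUDA_VISIBLE_DEVICES and given
# its own port, all registering with the same tracker under the same key.
def server_commands(num_gpus, tracker="127.0.0.1:9190", key="V100",
                    host="0.0.0.0", base_port=9091):
    cmds = []
    for gpu in range(num_gpus):
        cmds.append(
            f"CUDA_VISIBLE_DEVICES={gpu} nohup python3 -m tvm.exec.rpc_server "
            f"--tracker {tracker} --key {key} --host {host} "
            f"--port={base_port + gpu} &"
        )
    return cmds

# Print the launch command for each of 2 GPUs.
for cmd in server_commands(2):
    print(cmd)
```

Because every server registers with the same `--key`, the tracker treats the GPUs as one pool and the tuner can dispatch measurements to whichever device is free.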
The relevant part of `my_tvm_searching.py`:
```python
...
runner = tvm.auto_scheduler.RPCRunner(key="V100", host="localhost", port=9190, n_parallel=8, min_repeat_ms=300, timeout=1000)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=100,  # change this to 1000 to achieve the best performance
    runner=runner,
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
    builder=tvm.auto_scheduler.LocalBuilder(timeout=1000),
    verbose=2,
)
task.tune(tune_option)
...
```
NOTE: set `n_parallel` to the number of GPUs, and set `timeout` to a value large enough for your workload.
ref:
https://tvm.apache.org/docs/how_to/tune_with_autoscheduler/tune_network_arm.html?highlight=parallel
https://tvm.apache.org/docs/reference/api/python/auto_scheduler.html?highlight=tvm%20auto_scheduler%20rpcrunner#tvm.auto_scheduler.RPCRunner