Unable to customize the optimization passes in auto-tuning

I’d like to customize the optimization passes before extracting the operators to tune, something like:

    with relay.build_config(opt_level=2, required_pass=["AlterOpLayout"]):
        tasks = autotvm.task.extract_from_program(
            func, target=target_platform, params=params, ops=tune_ops)

However, the context is lost because extract_from_program runs in another thread.
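To illustrate the failure mode, here is a minimal, TVM-free sketch: assuming the build config is kept in thread-local storage, any state set in the main thread is simply not visible in a freshly spawned thread.

    import threading

    ctx = threading.local()
    ctx.opt_level = 2  # customization set in the main thread

    def worker():
        # A new thread gets a fresh thread-local namespace, so the value
        # set above is invisible here; this prints None.
        print(getattr(ctx, "opt_level", None))

    t = threading.Thread(target=worker)
    t.start()
    t.join()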

Any idea to keep the customization in tuning?

This won’t work. The part you highlighted just extracts the tunable ops by running through the graph; it has nothing to do with the opt_level. Also, one tuning task contains only a single op, so it tunes only the op schedule without considering graph-level optimization.

My op is special: its input shape depends on the previous op’s output shape. I’m thinking of two-stage tuning: first tune the regular ops and figure out the best output shapes, then tune my ops, since by that point their input shapes are all fixed.

So I do need to run AlterOpLayout when tuning my op. Oh, and I also need to run:

    with autotvm.apply_history_best(log_filename):

to get the best output shapes before tuning my ops.
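For reference, here is a minimal sketch of how I expect the two context managers to nest, shown with relay.build since that is their standard use (assuming the same pre-0.7 relay.build_config API as above, and func, params, and target_platform from my first snippet):

    from tvm import autotvm, relay

    # Apply the best records from stage one, then force AlterOpLayout so
    # that the layouts, and hence my op's input shapes, are fixed by the
    # tuned configs.
    with autotvm.apply_history_best(log_filename):
        with relay.build_config(opt_level=2, required_pass=["AlterOpLayout"]):
            graph, lib, params = relay.build(
                func, target=target_platform, params=params)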

Does that provide more context for the things I’d like to do?

Thanks.

It looks to me like your requirement is not supported by the current AutoTVM, but you could try a workaround. First tune the regular ops and figure out the best output shapes, as you said. After that, you can do something like the following:

    import json

    from tvm import autotvm

    # Only keep the best record for each workload
    autotvm.record.pick_best('history.json', 'best.json')

    with open('best.json', 'r') as f:
        for line in f:  # should have only N lines, where N is the number of regular ops
            row = json.loads(line)
            # 'i' is the task input; its last field is the config, whose 'e'
            # entry holds the knob entities (config factors) found by AutoTVM
            cfg = row['i'][5]['e']
            out_shape = your_function_to_get_output_shape(cfg)

    autotvm.task.create(...)  # create a task for your op based on out_shape
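To make that last line concrete, here is a hedged sketch of creating the task, assuming TVM >= 0.7 where templates are registered by name via autotvm.template; "my_special_op" and its trivial compute are hypothetical placeholders for your real op:

    from tvm import autotvm, te

    @autotvm.template("my_special_op")  # hypothetical template name
    def my_special_op_template(in_shape, dtype):
        data = te.placeholder(in_shape, name="data", dtype=dtype)
        out = te.compute(in_shape, lambda *i: data(*i) * 2, name="out")
        s = te.create_schedule(out.op)

        # Register a tunable knob; a real template would expose the
        # op's actual schedule choices here.
        cfg = autotvm.get_config()
        cfg.define_knob("unroll", [0, 1])
        if cfg["unroll"].val:
            s[out].unroll(out.op.axis[-1])
        return s, [data, out]

    # out_shape collected above becomes the now-fixed input shape
    task = autotvm.task.create("my_special_op",
                               args=(out_shape, "float32"),
                               target=target_platform)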

Thanks. That is indeed a workaround, just a lot of bookkeeping on my side.

Just curious: why do we need to create a separate thread? Is it just for performance reasons, or are there other considerations? I removed the multithreading in this case and the AlterOpLayout optimization pass did get run; I didn’t notice anything unusual.

I’m not sure either, but according to the comment it may trigger Python multiprocessing issues in some cases.