Hardware input in TVM

I am trying to understand how TVM uses the hardware attributes and where exactly. As far as I have seen in the code, the graph-level optimizations don't use any hardware knowledge to make target-specific optimizations. This means that the exported graph for a model will be the same for all targets, am I right?
The next step after nnvm.build is the TVM runtime, and there, as far as I have seen in the code, the target is used for specific optimization and scheduling. I see the different directories and functions under the topi folder, but I can't find where it is decided which one to call. Basically, I can't understand how the TVM runtime part works. Can anyone guide me where to look?
Thank you in advance

Good question. Everything you said is right.

I suggest taking a look at this PR.

From NNVM, the target dispatch system kicks in here.
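To make the dispatch idea concrete, here is a minimal, hypothetical Python sketch of how a target dispatch system of this kind can work: a generic function keeps a table of per-target overrides and falls back to a default implementation when no override is registered. The names (`GenericFunc`, `register`) are illustrative only, not the actual TVM/TOPI API.

```python
# Hypothetical sketch of target dispatch: per-target overrides with a
# generic fallback. Not the real TVM API, just the mechanism.

class GenericFunc:
    def __init__(self, default):
        self.default = default      # used when no target-specific version exists
        self.overrides = {}         # target name -> specialized implementation

    def register(self, target):
        def decorator(fn):
            self.overrides[target] = fn
            return fn
        return decorator

    def __call__(self, target, *args):
        # Pick the target-specific implementation if one was registered,
        # otherwise fall back to the generic one.
        fn = self.overrides.get(target, self.default)
        return fn(*args)

# Generic schedule acts as the fallback.
schedule_conv2d = GenericFunc(lambda: "generic schedule")

@schedule_conv2d.register("cuda")
def _():
    return "cuda-specific schedule"

print(schedule_conv2d("cuda"))   # dispatches to the cuda override
print(schedule_conv2d("llvm"))   # no override registered, uses the fallback
```

In TOPI the same pattern lets a compute definition stay target-generic while each backend registers its own schedule for it.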


I am also trying to find how the graph annotations are transformed into TVM operations. To be more specific, here is an example:

squeezenet0_conv0_weight

is translated to these tvm operations:

%3 = tvm_op(%data, %squeezenet0_conv0_weight, %squeezenet0_conv0_bias, num_outputs='1', num_inputs='3', func_name='fuse_conv2d', flatten_data='0')
%4 = tvm_op(%3, num_outputs='1', num_inputs='1', func_name='fuse_relu', flatten_data='1')
%5 = tvm_op(%4, num_outputs='1', num_inputs='1', func_name='fuse_max_pool2d', flatten_data='0')

My question is where in the code this task is performed, because I am trying to understand how the different functions you can use in a neural network are handled in the API.

squeezenet0_conv0_weight stands for the weight of the first conv layer of SqueezeNet. So it is not correct to say that "squeezenet0_conv0_weight is translated to three tvm ops (conv, relu, max_pool)".

Graph(%data,
      %squeezenet0_conv0_weight, %squeezenet0_conv0_bias,
      %squeezenet0_conv1_weight, %squeezenet0_conv1_bias,
      %squeezenet0_conv2_weight, %squeezenet0_conv2_bias,
      %squeezenet0_conv3_weight, %squeezenet0_conv3_bias,
      %squeezenet0_conv4_weight, %squeezenet0_conv4_bias,
      %squeezenet0_conv5_weight, %squeezenet0_conv5_bias,
      %squeezenet0_conv6_weight, %squeezenet0_conv6_bias,
      %squeezenet0_conv7_weight, %squeezenet0_conv7_bias,
      %squeezenet0_conv8_weight, %squeezenet0_conv8_bias,
      %squeezenet0_conv9_weight, %squeezenet0_conv9_bias,
      %squeezenet0_conv10_weight, %squeezenet0_conv10_bias,
      %squeezenet0_conv11_weight, %squeezenet0_conv11_bias,
      %squeezenet0_conv12_weight, %squeezenet0_conv12_bias,
      %squeezenet0_conv13_weight, %squeezenet0_conv13_bias,
      %squeezenet0_conv14_weight,
  %3 = tvm_op(%data, %squeezenet0_conv0_weight, %squeezenet0_conv0_bias, num_outputs='1', num_inputs='3', func_name='fuse_conv2d', flatten_data='0')
  %4 = tvm_op(%3, num_outputs='1', num_inputs='1', func_name='fuse_relu', flatten_data='1')
  %5 = tvm_op(%4, num_outputs='1', num_inputs='1', func_name='fuse_max_pool2d', flatten_data='0')
  %8 = tvm_op(%5, %squeezenet0_conv1_weight, %squeezenet0_conv1_bias, num_outputs='1', num_inputs='3', func_name='fuse_conv2d_1', flatten_data='0')
  %9 = tvm_op(%8, num_outputs='1', num_inputs='1', func_name='fuse_relu_1', flatten_data='1')
  %12 = tvm_op(%9, %squeezenet0_conv2_weight, %squeezenet0_conv2_bias, num_outputs='1', num_inputs='3', func_name='fuse_conv2d_2', flatten_data='0')
  %13 = tvm_op(%12, num_outputs='1', num_inputs='1', func_name='fuse_relu_2', flatten_data='1')
  %16 = tvm_op(%9, %squeezenet0_conv3_weight, %squeezenet0_conv3_bias, num_outputs='1', num_inputs='3', func_name='fuse_conv2d_3', flatten_data='0')
  %118 = tvm_op(%117, num_outputs='1', num_inputs='1', func_name='fuse_softmax', flatten_data='0')
  ret %118
}
graph_attr_keys = [storage_id, shape, dltype, dtype]

So, if I understand what you are saying, the lines at the beginning describe the parameters of each layer, and the tvm_op lines describe the different functions that are applied at each layer?

In NNVM, input data, learnable parameters, and operators are all represented as nodes in a Graph. So what you have above is just showing the list of nodes contained in SqueezeNet. The first line is the node for the input data. Then come nodes for parameters, followed by nodes for operators.
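The node structure described above can be sketched in a few lines of plain Python. This is a hypothetical illustration of the idea, not NNVM's actual data structures: variable nodes (data and parameters) carry no operator, while operator nodes name an op and reference their input nodes.

```python
# Hypothetical sketch of the NNVM-style graph described above:
# inputs, parameters, and operators are all just nodes in one graph.

class Node:
    def __init__(self, name, op=None, inputs=None):
        self.name = name
        self.op = op                # None for data/parameter (variable) nodes
        self.inputs = inputs or []  # operator nodes reference other nodes

data = Node("data")
weight = Node("squeezenet0_conv0_weight")
bias = Node("squeezenet0_conv0_bias")
conv = Node("%3", op="fuse_conv2d", inputs=[data, weight, bias])
relu = Node("%4", op="fuse_relu", inputs=[conv])

# Printing the graph shows why the dump above lists variables first,
# then operators: they are all the same kind of object.
for n in [data, weight, bias, conv, relu]:
    kind = "variable" if n.op is None else "operator"
    print(n.name, kind, [i.name for i in n.inputs])
```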

I see. My next question is: how can I see which operations are supported in NNVM?

https://docs.tvm.ai/nnvm_top.html

Thank you very much. One last thing: does each TVM operation correspond to a single CUDA or OpenCL kernel?

Yes, but a single kernel is typically a fused one, e.g. conv2d + batch norm + relu. Fusion is done automatically by NNVM when opt-level > 0.
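To see what fusion buys, here is a hedged NumPy sketch of the idea (not TVM-generated code): running scale, shift, and relu as three separate "kernels" materializes an intermediate buffer after each step, while the fused version makes one pass over the data. The scale/shift stands in for a batch-norm-style transform; a real fused conv2d kernel is generated by TVM, not hand-written like this.

```python
import numpy as np

def unfused(x, scale, shift):
    t1 = x * scale            # "kernel" 1: scale (stand-in for batch norm)
    t2 = t1 + shift           # "kernel" 2: shift (intermediate buffer t1, t2)
    return np.maximum(t2, 0)  # "kernel" 3: relu

def fused(x, scale, shift):
    # One pass over the data, no intermediate buffers: this is the shape of
    # what a fused conv2d + batch norm + relu kernel computes per element.
    return np.maximum(x * scale + shift, 0)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
assert np.allclose(unfused(x, 2.0, 1.0), fused(x, 2.0, 1.0))
print(fused(x, 2.0, 1.0))  # -> [0. 0. 1. 3. 5.]
```

On a GPU the saving is fewer kernel launches and far less memory traffic, which is why NNVM fuses elementwise ops into the preceding conv when opt-level > 0.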
