Hi everyone,
I’m trying to distribute “mobilenet-v2-7.onnx” across multiple CPU cores through TVM. I found an example of sharding in the “Sharding CONV Op” section here, but the sharding in that example is done manually.
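For context, this is roughly how I import the model (a minimal sketch, assuming the Relax ONNX frontend; “mobilenet-v2-7.onnx” is the file from the ONNX model zoo):

import onnx
from tvm.relax.frontend.onnx import from_onnx

# Convert the ONNX graph into a Relax IRModule.
onnx_model = onnx.load("mobilenet-v2-7.onnx")
mod = from_onnx(onnx_model)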
Currently I am trying to use `R.dist.annotate_sharding()` to shard the tensors automatically, using a single convolution layer as a test case, but I get an error. Is it possible to use TVM to automatically shard the whole “mobilenet-v2-7.onnx”? The error message is:
InternalError: Check failed: (op_map_dist_infer_struct_info_.count(op)) is false: Cannot find the dist.FInferStructInfo attribute registered to op: relax.nn.conv2d
My test module with a single convolution layer is:
from tvm.script import ir as I
from tvm.script import relax as R

@I.ir_module
class ConvolutionModule_1:
    I.module_attrs({"device_num": 2})
    I.module_global_infos(
        {
            "mesh": [
                R.device_mesh((2,), I.Range(0, 2)),  # mesh[0]: 2 devices
            ]
        }
    )

    @R.function
    def main(
        data: R.Tensor((1, 3, 224, 224), dtype="float32")
    ) -> R.Tensor((1, 32, 112, 112), dtype="float32"):
        R.func_attr({"num_input": 1})
        with R.dataflow():
            # Ask the sharding propagation to split `data` along axis 1 (channels).
            data = R.dist.annotate_sharding(data, device_mesh="mesh[0]", placement="S[1]")
            lv: R.Tensor((1, 32, 112, 112), dtype="float32") = R.nn.conv2d(
                data,
                metadata["relax.expr.Constant"][0],  # conv weights bound in the script metadata
                strides=[2, 2],
                padding=[1, 1, 1, 1],
                dilation=[1, 1],
                groups=1,
                data_layout="NCHW",
                kernel_layout="OIHW",
                out_layout="NCHW",
                out_dtype="void",
            )
            R.output(lv)
        return lv
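The error is raised when I run the sharding propagation over this module (a minimal sketch of my invocation; I’m assuming `PropagateSharding` from `tvm.relax.distributed.transform` is the pass that consumes the `annotate_sharding` hints, so adjust if your pipeline differs):

import tvm
from tvm.relax.distributed.transform import PropagateSharding

# This call fails with the InternalError quoted above,
# apparently while inferring distributed struct info for relax.nn.conv2d.
mod = PropagateSharding()(ConvolutionModule_1)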
Thanks for the help!