Teach NNVM to recognize TensorFlow models

srkreddy1238 · September 19, 2018, 3:42am

@jjiang2cal

You may try this initial version of the changes, with which I could compile Resnet_v2 via the TensorFlow frontend.

I am planning to PR it soon.

jjiang2cal · September 19, 2018, 9:55pm

Yes, setting the --labels_offset=1 flag when exporting the inference graph solves this problem. Thanks.
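
For readers unfamiliar with the flag: one common reason to use a label offset is that some ImageNet graphs have 1001 outputs with a background class at index 0, while the usual label list has 1000 entries. The sketch below only illustrates that index shift under this assumption. The helper name `to_1000_class_index` is hypothetical and is not part of slim or NNVM.

```python
def to_1000_class_index(index_1001, labels_offset=1):
    """Map an index from a 1001-way output (background at 0) to a 1000-way index.

    Hypothetical helper for illustration; assumes the background class
    occupies the first `labels_offset` positions.
    """
    if index_1001 < labels_offset:
        raise ValueError("index refers to the background class")
    return index_1001 - labels_offset


# Index 231 in a 1001-way output corresponds to index 230 in a 1000-way list.
print(to_1000_class_index(231))
```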

jjiang2cal · September 19, 2018, 11:08pm

@srkreddy1238

Thanks for the quick commit!

When I tried tf slim models of resnet 50 v1 and v2 (https://github.com/tensorflow/models/tree/master/research/slim), I got NotImplementedError: Please freeze the graph with add_shapes=True. I use the freeze script from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py, and it does not have an add_shapes option. Is there any other freeze_graph I should use?

Sorry for bothering you so much.

srkreddy1238 · September 20, 2018, 4:23am

freeze_graph.py --input_saved_model_dir=20180601_resnet_v2_imagenet_savedmodel/1527888387/ --output_graph=frozen_model-v2-fp16.pb --output_node_names=ArgMax --clear_devices

I use this command to freeze the model.

Ref.

The helper function shown below can be used to add shapes.

graph_def = nnvm.testing.tf.AddShapesToGraphDef('softmax')

jjiang2cal · September 20, 2018, 5:41pm

@srkreddy1238

This is the official model I first used. The input node and output node of this model (inspected by https://github.com/tf-coreml/tf-coreml/blob/master/utils/inspect_pb.py) are:

0: op name = import/input_tensor, op type = ( Placeholder ), inputs = , outputs = import/input_tensor:0
@input shapes:
@output shapes:
name = import/input_tensor:0 : (128, 224, 224, 3)
......
 702: op name = import/ArgMax, op type = ( ArgMax ), inputs = import/resnet_model/final_dense:0, import/ArgMax/dimension:0, outputs = import/ArgMax:0
@input shapes:
name = import/resnet_model/final_dense:0 : (128, 1001)
name = import/ArgMax/dimension:0 : ()
@output shapes:
name = import/ArgMax:0 : (128,)

Since NNVM/TVM does not support batch_size > 1, I set batch_size to 1.
With your patch, it compiled to NNVM successfully.
But I have a question about inference. The output of the graph is ArgMax, so it is the class index of the classification, and if batch_size is 1, the output shape should be (1,). I used the code below to run inference:

......
out = module.get_output(0)
tvm_out = out.asnumpy()
print(tvm_out)

It printed out [766 774 457 766 729 701 824]
while I expected 230. And I don’t understand why it is a 7-element vector.

Below is one picture I used for inference. It is from the ImageNet dataset, classified as 230: ‘Shetland sheepdog, Shetland sheep dog, Shetland’.

ILSVRC2012_val_00000003

Do you have any insight into how I should run the inference? Thanks a lot.
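
For reference, the expectation above can be written down as a small pure-Python sketch: ArgMax over the class axis of a (batch, 1001) tensor yields one index per image, so a batch of one gives a single-element result. The helper `argmax_last_axis` and the toy logits are made up for illustration; this is not the TVM runtime code.

```python
def argmax_last_axis(logits):
    """Return the index of the largest value in each row of a 2-D list."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


# One image (batch_size = 1) with 1001 class scores; class 230 is made the winner.
logits = [[0.0] * 1001]
logits[0][230] = 1.0

result = argmax_last_axis(logits)
print(result, len(result))  # one index per image in the batch
```

A 7-element result therefore means the reduction did not run over a (1, 1001) tensor as the frozen graph describes.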

jjiang2cal · September 20, 2018, 6:05pm

@srkreddy1238

For the research slim model,
graph_def = nnvm.testing.tf.AddShapesToGraphDef('resnet_v2_50/predictions/Reshape_1')
does solve the add-shapes error. But conversion then fails with:

File "from_tensorflow_slim_v2.py", line 124, in <module>
    graph, lib, params = nnvm.compiler.build(sym, target, shape_dict, params=params)
  File "/tvm/nnvm/python/nnvm/compiler/build_module.py", line 270, in build
    ishape, _ = graph_util.infer_shape(graph, **shape)
  File "/tvm/nnvm/python/nnvm/compiler/graph_util.py", line 31, in infer_shape
    graph = graph.apply("InferShape")
  File "/tvm/nnvm/python/nnvm/graph.py", line 234, in apply
    check_call(_LIB.NNGraphApplyPasses(self.handle, npass, cpass, ctypes.byref(ghandle)))
  File "/tvm/nnvm/python/nnvm/_base.py", line 75, in check_call
    raise NNVMError(py_str(_LIB.NNGetLastError()))
nnvm._base.NNVMError: Error in operator resnet_v2_50/SpatialSqueeze: [17:58:16] /tvm/nnvm/src/top/tensor/transform.cc:693: Check failed: shp[i] == 1 (7 vs. 1) The squeezed axis must have shape 1!Want to squeeze 2, which has shape7

The input, output and the resnet_v2_50/SpatialSqueeze nodes are as below:

0: op name = import/input, op type = ( Placeholder ), inputs = , outputs = import/input:0
@input shapes:
@output shapes:
name = import/input:0 : (?, 224, 224, 3)
......
1762: op name = import/resnet_v2_50/SpatialSqueeze, op type = ( Squeeze ), inputs = import/resnet_v2_50/logits/BiasAdd:0, outputs = import/resnet_v2_50/SpatialSqueeze:0
@input shapes:
name = import/resnet_v2_50/logits/BiasAdd:0 : (?, 1, 1, 1001)
@output shapes:
name = import/resnet_v2_50/SpatialSqueeze:0 : (?, 1001)
......
1767: op name = import/resnet_v2_50/predictions/Reshape_1, op type = ( Reshape ), inputs = import/resnet_v2_50/predictions/Softmax:0, import/resnet_v2_50/predictions/Shape:0, outputs = import/resnet_v2_50/predictions/Reshape_1:0
@input shapes:
name = import/resnet_v2_50/predictions/Softmax:0 : (?, 1001)
name = import/resnet_v2_50/predictions/Shape:0 : (2,)
@output shapes:
name = import/resnet_v2_50/predictions/Reshape_1:0 : (?, 1001)
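
The check that fails in the traceback can be illustrated with a small pure-Python sketch: a squeeze may only drop axes whose size is 1. The function `squeeze_shape` is a hypothetical stand-in, not the NNVM implementation, and the (1, 7, 7, 1001) shape is chosen only to reproduce the "7 vs. 1" condition; where the 7 comes from in the converted graph is not established here.

```python
def squeeze_shape(shape, axes):
    """Drop the given axes from shape, requiring each dropped axis to have size 1."""
    for axis in axes:
        if shape[axis] != 1:
            raise ValueError(
                f"cannot squeeze axis {axis}: it has size {shape[axis]}, expected 1"
            )
    return tuple(dim for i, dim in enumerate(shape) if i not in axes)


# Shape recorded in the frozen graph for logits/BiasAdd, with batch set to 1.
print(squeeze_shape((1, 1, 1, 1001), axes=(1, 2)))

# A tensor that still has a 7x7 spatial extent is rejected.
try:
    squeeze_shape((1, 7, 7, 1001), axes=(1, 2))
except ValueError as err:
    print(err)
```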

jjiang2cal · September 20, 2018, 6:37pm

@FrozenGene

Did you run into

File "/tvm/nnvm/python/nnvm/frontend/coreml.py", line 182, in PoolingLayerParams
    raise NotImplementedError("Other convolution padding not implemented")
NotImplementedError: Other convolution padding not implemented

when converting CoreML to NNVM? (The CoreML model was converted from the research/slim ResNet-50 v2 TensorFlow model.)

FrozenGene · September 21, 2018, 5:35pm

I have done a lot of work on CoreML. For convolution, I support SAME / VALID using 4-D padding (I haven’t contributed this back to the community yet, but will do so soon). For pooling, I also support its padding completely. So I really cannot figure out the error in detail with only this information. I suggest converting the .mlmodel to text format (you can Google how to do it) and then checking the detailed information for this layer.
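
As an illustration of what 4-D padding for SAME means, here is a minimal sketch that follows the TensorFlow SAME convention (output size is ceil(input / stride), with any odd padding placed at the bottom/right). It is not the unreleased CoreML frontend code mentioned above; the function names and the (top, left, bottom, right) ordering are assumptions made for this example. VALID corresponds to zero padding on all four sides.

```python
import math


def same_padding_1d(in_size, kernel, stride):
    """Return (pad_before, pad_after) for TensorFlow-style SAME padding on one axis."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2
    return before, total - before


def same_padding_4d(in_hw, kernel_hw, stride_hw):
    """Return (top, left, bottom, right) padding for a 2-D convolution or pooling."""
    top, bottom = same_padding_1d(in_hw[0], kernel_hw[0], stride_hw[0])
    left, right = same_padding_1d(in_hw[1], kernel_hw[1], stride_hw[1])
    return top, left, bottom, right


# 7x7 convolution with stride 2 on a 224x224 input.
print(same_padding_4d((224, 224), (7, 7), (2, 2)))
# 3x3 pooling with stride 2 on a 112x112 input.
print(same_padding_4d((112, 112), (3, 3), (2, 2)))
```

The asymmetric results are why a single symmetric padding value per axis is not enough to express SAME in general.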

srkreddy1238 · September 22, 2018, 4:37am

@jjiang2cal

I know about the shape operator issue with resnet_v2_50 above; I will try to share the fix for it soon.

srkreddy1238 · October 24, 2018, 11:47am

With this, the TensorFlow frontend can support all models (Inception, Resnet, MobilenetV1/V2, Vgg) from research/slim.

Since not all models can be integrated into the TVM test cases, I have added some utilities for validation at https://github.com/srkreddy1238/dmlc_data/tree/master/work/tf/samples for reference.