Conv2D in TVM seems to have a relatively large floating point error when converting from CoreML

I am investigating an SSD MobileNet model and found that TVM cannot produce the right box positions. So I wrote a simple test case using TensorFlow with just one Conv2D operation. I converted the TensorFlow model to CoreML and found that CoreML's output is almost the same as TensorFlow's. But when I feed the CoreML model into TVM, TVM's output is different from the CoreML / TensorFlow output. And sadly, CoreML / TensorFlow is the right answer. Could anyone explain this and give advice on how to fix it?

Attached code:

    import numpy as np
    import tensorflow as tf

    input_tensor = tf.placeholder(tf.float32, [1, 224, 224, 3], name="input_tensor")
    input_value = np.random.rand(1, 224, 224, 3)
    net = tf.contrib.slim.conv2d(input_tensor, 16, [3, 3])

If anyone wants the TensorFlow model / CoreML model / test script, I would be happy to provide them. Thanks in advance.

I tested it and found that if the weight params are very small (for example, 2.0836826537487557e-16 and so on), we hit this problem.

I see, it is likely due to a floating point truncation problem?

@tqchen I have found the answer now. Like you, I suspected floating point truncation at first and spent a lot of time investigating that. However, the real cause is that we do not support CoreML's SAME padding correctly: when the stride is 2, our computation method is wrong. The link is: https://apple.github.io/coremltools/coremlspecification/sections/NeuralNetwork.html#samepadding

Our output shape height / width is also not right. Currently our C++ implementation in convolution.cc / Python's conv2d_nchw only implements the formula for CoreML's VALID padding. For example, in C++:

    oshape[2] = (dshape[2] + param.padding[0] * 2 - dilated_ksize_y) / param.strides[0] + 1;

Here we do not consider CoreML's SAME padding at all. But SAME padding is used very commonly and we need to support it. While trying to implement it, I found that passing only padding_top / padding_left is not enough: we need to pass padding_top, padding_bottom, padding_left, padding_right to Symbol.Conv2D, and NNVM needs to know whether it is SAME padding or VALID padding.
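To make this concrete, here is a small sketch I wrote (the helper name is mine, but the formula follows the CoreML spec linked above) that computes the asymmetric SAME padding for one spatial dimension; with stride 2 the two sides usually end up unequal:

    import math

    def same_padding_1d(in_size, kernel, stride, bottom_right_heavy=True):
        """Compute (pad_begin, pad_end) for SAME padding in one dimension.

        The total pad is chosen so that out_size == ceil(in_size / stride);
        the odd pixel goes to the end side (bottom/right) in the
        BOTTOM_RIGHT_HEAVY convention, otherwise to the begin side.
        """
        out_size = int(math.ceil(float(in_size) / stride))
        total_pad = max((out_size - 1) * stride + kernel - in_size, 0)
        small, big = total_pad // 2, total_pad - total_pad // 2
        return (small, big) if bottom_right_heavy else (big, small)

    print(same_padding_1d(224, 3, 1))  # (1, 1) -- symmetric, the case we handle today
    print(same_padding_1d(224, 3, 2))  # (0, 1) -- asymmetric, the broken stride-2 case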

We had a similar PR: https://github.com/dmlc/nnvm/pull/260. However, it is not enough for our purpose. We need to distinguish SAME padding with BOTTOM_RIGHT_HEAVY, SAME padding with TOP_LEFT_HEAVY, and VALID padding. I would suggest adding one param like that PR did, named padding_mode, with four accepted values: NONE (default), SAME_TOP_LEFT_HEAVY, SAME_BOTTOM_RIGHT_HEAVY, VALID. If padding_mode is NONE, we go the traditional way we do now, but let the user provide 4 values (top, left, bottom, right); this is very easy, just extend padding's default shape from {0, 0} to {0, 0, 0, 0}, which is what ONNX does. If we get a non-default value, we handle it ourselves, for example we calculate the padding and ignore the padding params. How to do this best can be discussed. I am also in the middle of implementing it; a rough sketch of the resolution step is below.
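Roughly, a non-default padding_mode would be resolved into explicit (top, left, bottom, right) pads before shape checking. A sketch of what I have in mind (padding_mode and these names are only my proposal, nothing here exists in NNVM yet):

    import math

    def resolve_padding(padding_mode, in_hw, kernel_hw, strides):
        """Turn the proposed padding_mode into explicit (top, left, bottom, right)."""
        if padding_mode == "VALID":
            return (0, 0, 0, 0)
        pads = []
        for in_size, k, s in zip(in_hw, kernel_hw, strides):
            out_size = int(math.ceil(float(in_size) / s))
            total = max((out_size - 1) * s + k - in_size, 0)
            small, big = total // 2, total - total // 2
            # BOTTOM_RIGHT_HEAVY puts the odd pixel at the end, TOP_LEFT_HEAVY at the begin
            pads.append((small, big) if padding_mode == "SAME_BOTTOM_RIGHT_HEAVY"
                        else (big, small))
        (top, bottom), (left, right) = pads
        return (top, left, bottom, right)

    print(resolve_padding("SAME_BOTTOM_RIGHT_HEAVY", (224, 224), (3, 3), (2, 2)))
    # (0, 0, 1, 1)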

I thought we could do it the way Theano does (http://deeplearning.net/software/theano/library/tensor/nnet/conv.html?highlight=conv2d#theano.tensor.nnet.conv.conv2d):

mode (int or tuple) – One of “valid”, “full”, “half”, an integer, or a tuple where each member is either an integer or a tuple of 2 positive integers.

However, we have C++ code that checks the shape, and padding's type there is TShape, so we cannot do it Theano's way. If you have any good ideas or suggestions, I would like to hear them.

To keep the IR minimal, so far we have specified padding as a tuple of two elements. A simple extension would be to allow padding to be a tuple of four elements, where the latter two elements indicate the padding on the bottom and right, if specified.

This would allow us to meet most of the padding conventions without having to introduce automatic padding calculations.

@tqchen Even if we use four elements to specify the padding, our out_height / out_width computation still needs to know what type of padding it is, because the formulas for SAME padding and VALID padding are different. If we go this way, how do we solve that? Add a fifth element to indicate the padding type?

For the SAME_TOP_LEFT_HEAVY / SAME_BOTTOM_RIGHT_HEAVY padding:

    out_height = ceil(float(in_height) / float(strides[1]))
    out_width = ceil(float(in_width) / float(strides[2]))

For the VALID padding:

    out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))

out_height / out_width is not only needed in the C++ shape checking, it is also needed in the compute / schedule part, so this is very important.
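To make the difference concrete with my own example numbers (224 input, 3x3 kernel, stride 2), the two formulas give different output sizes:

    import math

    in_height, filter_height, stride = 224, 3, 2

    # SAME (either heavy variant): output size depends only on input size and stride
    same_out = int(math.ceil(float(in_height) / stride))                       # 112

    # VALID: no padding, the kernel must fit entirely inside the input
    valid_out = int(math.ceil(float(in_height - filter_height + 1) / stride))  # 111

    print(same_out, valid_out)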

In both cases, the calculation could be

    (in_height + pad_top + pad_bottom - filter_height) / strides[0] + 1

Note that VALID padding means the pad values equal 0.

You mean:

    ceil((in_height + pad_top + pad_bottom - filter_height) / strides[0] + 1)

or

    (in_height + pad_top + pad_bottom - filter_height) / strides[0] + 1 ?

I think you mean ceil((in_height + pad_top + pad_bottom - filter_height) / strides[0] + 1)?

BTW, I want to ask one simple question. For example, I get the insym from the model in

    def ConvolutionLayerParams(op, insym, symtab):

How do I get the input_height / input_width from insym? Since it is a symbol, it seems hard to get the input_height / input_width and so on, but I think there should be a way to do it. If we want to calculate the padding as (top, left, bottom, right), we have to calculate it during the model translation part.

I mean ( (in_height + pad_top + pad_bottom - filter_height) / strides[0] + 1 )
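A quick check with the same example numbers (224 input, 3x3 kernel, stride 2) and plain integer division, to show that the pad-based formula reproduces both cases once the pads are filled in:

    in_height, filter_height, stride = 224, 3, 2

    # SAME (bottom-right heavy): pad_top = 0, pad_bottom = 1
    print((in_height + 0 + 1 - filter_height) // stride + 1)  # 112

    # VALID: all pads are 0
    print((in_height + 0 + 0 - filter_height) // stride + 1)  # 111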

This is related to the partial shape inference issue: https://github.com/dmlc/nnvm/blob/master/python/nnvm/frontend/onnx.py#L391
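For the insym question, a minimal sketch of how that shape-inference trick could be used in a frontend (this assumes nnvm.compiler.graph_util.infer_shape and an input placeholder named "data"; it is an illustration, not the exact code in the linked file):

    import nnvm
    from nnvm.compiler import graph_util

    def input_hw(insym, data_shape):
        """Infer the (height, width) that insym produces, given the model input shape."""
        graph = nnvm.graph.create(insym)
        # partial shape inference: inputs whose shape is not given stay unknown (0)
        _, out_shapes = graph_util.infer_shape(graph, data=data_shape)
        n, c, h, w = out_shapes[0]  # NCHW layout is assumed here
        return h, w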

Ok, I see.

From the link you provided, I see that we are using a hack to do it.

Now, if we add two more attributes for this, our scheduling / computation parts are all affected, because previously our logic only needed HPAD, WPAD. It seems this is not a small amount of work, sadly. :frowning:

Most of the work will result in symmetric pad values, which falls into the previously known case. We can start from there and add support for asymmetric pads in topi later.

Yes, I understand. However, we want to support CoreML's SSD MobileNet model and similar models, which unfortunately use asymmetric pads. Currently my implementation takes a padding_mode passed by users, such as SAME_TOP_LEFT_HEAVY, SAME_BOTTOM_RIGHT_HEAVY and so on, and I just set bottom_pad / right_pad to 0 in the scheduling part. I want to make the whole model run successfully and give the right result. Luckily, I now get the same result as CoreML on the MobileNet model. Previously we could predict 'tiger cat' for the 'cat' image, but the floating point values in the result were not correct; now we have the same output as CoreML.

I am working on SSD MobileNet now and will change the implementation to the method we discussed, dropping padding_mode.