Hello everyone,
I’m using a multi-output Keras model of YOLOv3. I’m able to get multiple outputs from the model, but they are wrong. I tried switching the target between opencl and llvm with opt_level values from 0 to 4 in the TVM compiler, feeding input data of type float32, but none of the results match the original model’s predictions, or even come close.
out_shape = [dim.value if dim.value is not None else 1 for dim in keras_yolo._output_layers[0].output.shape]
tvm_out = m.get_output(0, tvm.nd.empty(out_shape, 'float32')).asnumpy()
to
(n, h, w, c) = [dim.value if dim.value is not None else 1 for dim in keras_yolo._output_layers[0].output.shape]
tvm_out = m.get_output(0, tvm.nd.empty((n, c, h, w), 'float32')).asnumpy()
tvm_out = tvm_out.transpose([0, 2, 3, 1])
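The idea behind the transpose above can be illustrated with plain NumPy (the head shape below is a hypothetical YOLOv3 output, batch=1, 255 channels, 13x13 grid):

```python
import numpy as np

# TVM emits this output in channels-first layout: (N, C, H, W).
nchw = np.arange(1 * 255 * 13 * 13, dtype="float32").reshape(1, 255, 13, 13)

# Keras expects channels-last: (N, H, W, C), so move axis 1 to the end.
nhwc = nchw.transpose([0, 2, 3, 1])
assert nhwc.shape == (1, 13, 13, 255)

# Same element, different index order: nchw[n, c, h, w] == nhwc[n, h, w, c]
assert nchw[0, 5, 2, 3] == nhwc[0, 2, 3, 5]
```

A plain `reshape` to `(1, 13, 13, 255)` would not work here: it keeps the memory order and scrambles the values, which is why the copy-then-transpose is required.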
Hey @kazum,
Thanks for your response. I actually tried this possibility earlier. As I understood it, the problem is not about reshaping the output values but about the prediction values from the compiled model themselves. I cross-checked every output value of the TVM model against the Keras model (just in case the TVM output was merely misaligned), but none of them matched or even came close. So I assume there is an issue with the Keras-to-TVM compilation for this conversion.
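For such a cross-check, the mismatch can be quantified instead of compared by eye. A pure-NumPy sketch, with synthetic arrays standing in for the two models' outputs (the shapes and values here are hypothetical):

```python
import numpy as np

# Hypothetical stand-ins for one output head from each runtime; in the
# real check these would come from keras_model.predict(...) and m.get_output(...).
keras_out = np.random.rand(1, 13, 13, 255).astype("float32")
tvm_out = keras_out + np.float32(1e-6)  # simulated TVM output with a tiny offset

# Worst-case absolute disagreement between the two predictions.
max_abs_err = np.abs(keras_out - tvm_out).max()
print("max abs error:", max_abs_err)

# Raises with a detailed report if the outputs disagree beyond tolerance.
np.testing.assert_allclose(keras_out, tvm_out, rtol=1e-4, atol=1e-4)
```

If the layouts are mismatched (NCHW vs NHWC), this check fails immediately with large errors, which distinguishes a layout problem from a genuine numerical compilation bug.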
I tried the following code and no assertions were raised. It looks like the outputs of Keras and TVM are at least close. If you get errors with my example, can you share your yolo.h5 file?
import keras
import numpy as np
import nnvm
import tvm
from tvm.contrib import graph_runtime

keras_model = keras.models.load_model('yolo.h5')

in_shapes = []
for layer in keras_model._input_layers:
    in_shapes.append([1 if dim is None else dim for dim in layer.input_shape])

out_shapes = []
for layer in keras_model._output_layers:
    out_shapes.append([1 if dim is None else dim for dim in layer.output_shape])

def get_tvm_output(xs):
    def to_channels_last(shape):
        return [shape[0]] + list(shape[2:]) + [shape[1]]

    def to_channels_first(shape):
        return [shape[0], shape[-1]] + list(shape[1:-1])

    dtype = 'float32'
    # Convert the inputs to channels-first before feeding them to TVM.
    xs = [x.transpose(to_channels_first(range(x.ndim))) for x in xs]
    sym, params = nnvm.frontend.from_keras(keras_model)
    shape_dict = {name: x.shape for (name, x) in zip(keras_model.input_names, xs)}
    graph, lib, params = nnvm.compiler.build(sym, "llvm", shape_dict, params=params)
    m = graph_runtime.create(graph, lib, tvm.cpu())
    for name, x in zip(keras_model.input_names, xs):
        m.set_input(name, tvm.nd.array(x.astype(dtype)))
    m.set_input(**params)
    m.run()
    tvm_out = []
    for i, shape in enumerate(out_shapes):
        # Outputs are channels-first, so fetch them as NCHW and transpose back.
        out = m.get_output(i, tvm.nd.empty(to_channels_first(shape), dtype)).asnumpy()
        out = out.transpose(to_channels_last(range(out.ndim)))
        tvm_out.append(out)
    return tvm_out

xs = [np.random.uniform(size=shape) for shape in in_shapes]
keras_out = keras_model.predict(xs)
tvm_out = get_tvm_output(xs)
for a, b in zip(keras_out, tvm_out):
    np.testing.assert_allclose(a, b, rtol=1e-4, atol=1e-4)
@kazum thanks for this example. My mistake was that I was copying the output directly without accounting for the channels-first layout. I tried your implementation: I first copied into an empty channels-first array and then transposed to channels-last, and yes, the outputs now match closely within a small error tolerance. It finally works. It is mentioned nowhere that the outputs are also channels-first, i.e. [batch_size, C, H, W], hence this confusion; please update the docs/tutorials :). However, the GPU-to-CPU copying (sync or async) is a bottleneck that makes this implementation unsuitable for my application. I would like to share some results of my implementation, in case they are helpful to someone: