Existing support:
upsampling with args: layout (NCHW or NHWC) and scale (integer).
To be considered with Bilinear:
1: mode (new arg): to choose NN or BILINEAR.
2: scale (modify): make it a tuple to support asymmetric scaling.
To discuss:
1: Do we rename upsampling -> scale (ideally it enlarges or squeezes the input)?
2: Do we take a scale factor or an output resolution? (A scale factor becomes an awkward fraction when only a minor change in the intended output resolution is wanted.)
Asymmetric scale: means scaling from, say, 100 to 210; the scale factor here is 2.1.
With this algorithm the stride and window values become dynamic.
The solution I could think of:
Going further into the implementation details, the catch lies in the x, y, x_diff, y_diff calculation for each output pixel.
These can be computed during the nnvm build process (from the input and output shapes) and added to the params list.
At run time we just substitute into Y = A(1-w)(1-h) + B(w)(1-h) + C(h)(1-w) + Dwh.
So you mean that we can support a scale factor that is not an integer, right?
That is good, because I know some implementations only support integer scales. Could you explain the algorithm in more detail? An example would be even nicer; I am very interested.
In each dimension we need to generate 110 extra pixels which fall between the 100 source pixels.
Sometimes there is 1 new pixel between two source pixels and sometimes there are 2 (hence the window is not the same across the output for an asymmetric scale).
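The varying window can be made concrete with a small plain-Python sketch, using the 100 -> 210 example (the variable names are just illustrative):

```python
# For 100 -> 210 (scale 2.1), map each output pixel back to a fractional
# source coordinate; the integer neighbours and the fractional weight
# differ from pixel to pixel, so the window is not constant.
in_size, out_size = 100, 210
step = in_size / out_size            # source step per output pixel (~0.476)

mapping = []
for out_x in range(5):
    src = out_x * step               # fractional source position
    x0 = int(src)                    # left source neighbour
    x1 = min(x0 + 1, in_size - 1)    # right source neighbour
    w = src - x0                     # weight toward x1
    mapping.append((out_x, x0, x1, round(w, 3)))

# Output pixels 0-2 all fall between source pixels 0 and 1, while pixels
# 3-4 fall between 1 and 2: sometimes two new pixels per gap, sometimes one.
```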
The bilinear approach takes the 4 input pixels around the target location and derives the new pixel with a certain weight from each.
1: Every pixel in the scaled result requires 4 pixels from the input image.
2: The offsets of the target from those 4 pixels, on a scale of 0 to 1, are the weights.
Only w, h are needed, as (1-w), (1-h) give the weights from the other pixels.
Each target pixel is then computed from the 4 input pixels as below:
Y = A(1-w)(1-h) + B(w)(1-h) + C(h)(1-w) + Dwh .
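As a quick sanity check of the formula, with hypothetical values (A is the top-left neighbour, B top-right, C bottom-left, D bottom-right; w, h are the fractional offsets of the target):

```python
# Hypothetical neighbourhood values and offsets, just to exercise the formula.
A, B, C, D = 10.0, 20.0, 30.0, 40.0    # the 4 surrounding input pixels
w, h = 0.25, 0.5                       # fractional offsets of the target
Y = A*(1 - w)*(1 - h) + B*w*(1 - h) + C*h*(1 - w) + D*w*h
# -> 3.75 + 2.5 + 11.25 + 5.0 = 22.5
```

With w = h = 0 the result collapses to A, i.e. the target sits exactly on a source pixel.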
Our approach for TVM:
As the input shape of a graph is fixed, we can precompute the “source pixel indexes” and “weights” for every output pixel and store them in params.
Hence on the target it’s just the Y calculation from the input and the weights.
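A rough NumPy sketch of that split (build-time precompute vs. run-time apply); the function names and the NHWC layout are assumptions for illustration, not the actual TOPI code:

```python
import numpy as np

def precompute_axis(in_size, out_size):
    """Build time: indices and weights for one axis, from shapes alone."""
    src = np.arange(out_size) * (in_size / out_size)
    i0 = np.floor(src).astype(np.int32)      # left/top source neighbour
    i1 = np.minimum(i0 + 1, in_size - 1)     # right/bottom source neighbour
    w = (src - i0).astype(np.float32)        # weight toward i1
    return i0, i1, w

def apply_bilinear(data, y0, y1, wy, x0, x1, wx):
    """Run time: only gathers plus the weighted sum, no index math."""
    wy = wy[:, None, None]                   # broadcast over W, C
    wx = wx[None, :, None]                   # broadcast over H, C
    top = data[:, y0][:, :, x0] * (1 - wx) + data[:, y0][:, :, x1] * wx
    bot = data[:, y1][:, :, x0] * (1 - wx) + data[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bot * wy
```

With identical input and output shapes all weights are zero, so the output reproduces the input, which is a handy correctness check.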
I have implemented it in Python and it should be easy. Note that TF’s computation has one special attribute, align_corners, which affects the result.
OK. However, I am also trying to port my implementation into TVM topi. I am not very familiar with the tvm compute mechanism yet, so I want to ask for your advice: when I move my numpy implementation to tvm topi, how do I move these for loops into the tvm.compute mechanism? Let me show you the code example:
import math

for b in range(batches):
    for y in range(output_height):
        for x in range(output_width):
            for c in range(depth):
                # Map the output pixel back to a fractional input coordinate.
                input_y = y * height_scale
                y0 = int(math.floor(input_y))
                y1 = min(y0 + 1, input_height - 1)
                input_x = x * width_scale
                x0 = int(math.floor(input_x))
                x1 = min(x0 + 1, input_width - 1)
                # Weighted sum of the four neighbouring input pixels.
                interpolation = \
                    input_data[b, y0, x0, c] * (1 - (input_y - y0)) * (1 - (input_x - x0)) + \
                    input_data[b, y1, x0, c] * (input_y - y0) * (1 - (input_x - x0)) + \
                    input_data[b, y0, x1, c] * (1 - (input_y - y0)) * (input_x - x0) + \
                    input_data[b, y1, x1, c] * (input_y - y0) * (input_x - x0)
                output_data[b, y, x, c] = interpolation
The compute result is stored into output_data. I haven’t found a tutorial or example about this.
Scaling down needs a small change in the logic as well, which is not yet clear to me.
In a nutshell:
compute takes the output shape and a lambda function.
compute uses the shape to generate iterators and calls the lambda to build the compute logic in terms of those iterator variables.
Later, lowering generates the IR from the iterators and the computational logic.
You can refer to and play around with some existing sample code (maybe up_sampling, which may be the closest to this).
Yes, I have read the up_sampling example. However, as in your GitHub example, we have more complex logic; the up_sampling example just passes the scale to H / W. Currently I do not understand how to port it into tvm.compute. (As you said, we have the output shape and one lambda, but how do we combine our logic via its iterator variables?)
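To make the shape + lambda idea concrete without TVM specifics, here is a plain-Python mock of what compute conceptually does. This is an untested conceptual sketch, not TOPI code: in real TVM the body is evaluated once with symbolic iterators, so int()/min() would have to become TVM expressions (e.g. a floor intrinsic with an astype to int32, and TVM's min), and align_corners handling is ignored here.

```python
import numpy as np

def mock_compute(shape, fcompute):
    """Plain-Python stand-in for tvm.compute: iterate the output shape and
    call the body once per index tuple. TVM instead calls the body once
    with symbolic iterator variables and lowers the resulting expression."""
    out = np.empty(shape)
    for idx in np.ndindex(*shape):
        out[idx] = fcompute(*idx)
    return out

def bilinear_body(data, height_scale, width_scale):
    """Return the per-pixel body: this is the role the lambda plays."""
    in_h, in_w = data.shape[1], data.shape[2]
    def body(b, y, x, c):
        in_y, in_x = y * height_scale, x * width_scale
        y0, x0 = int(in_y), int(in_x)          # TVM: floor intrinsic + astype
        y1 = min(y0 + 1, in_h - 1)             # TVM: min expression
        x1 = min(x0 + 1, in_w - 1)
        h, w = in_y - y0, in_x - x0
        return (data[b, y0, x0, c] * (1 - h) * (1 - w)
                + data[b, y1, x0, c] * h * (1 - w)
                + data[b, y0, x1, c] * (1 - h) * w
                + data[b, y1, x1, c] * h * w)
    return body
```

Usage mirrors the loop nest above: out = mock_compute((batches, out_h, out_w, depth), bilinear_body(input_data, h_scale, w_scale)); the same body function, rewritten with TVM expressions, is what would be handed to tvm.compute.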