[VTA] Dense calculation tutorial: graph packing and quantization

Hi, I’m currently following this sample code (c8064b), and there are a few points I don’t understand. Here are my questions:

  1. Why does this code limit the maximum and minimum values with the “my_clip” function?

  2. When you designate the data shape (code lines 71–77), is there a rule that must be strictly followed? That is, is there a special reason the order has to be (batch_size // env.BATCH, in_feat // env.BLOCK_IN, env.BATCH, env.BLOCK_IN)? If there is documentation on this, please point me to it.

  3. I don’t understand the role of code lines 91 and 92. Why do they perform a right_shift and then clipping?

  4. At code lines 105 and 106, why do they limit the range of values using (1 << (env.INP_WIDTH - 1))? This may be related to question 1.

  5. Does this code guarantee the best performance? If so, does it still guarantee the best performance even if I change (batch, input_size, output_size)?
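To make question 2 concrete, here is a minimal NumPy sketch of the packed layout I mean. The tile sizes are my assumptions matching the default VTA configuration (env.BATCH = 1, env.BLOCK_IN = 16); this is only an illustration of the reshape/transpose, not the tutorial’s actual code:

```python
import numpy as np

# Assumed tile sizes from the default VTA config (hypothetical values).
BATCH = 1      # stands in for env.BATCH
BLOCK_IN = 16  # stands in for env.BLOCK_IN

batch_size, in_feat = 4, 64
data = np.arange(batch_size * in_feat).reshape(batch_size, in_feat)

# Pack the flat (batch_size, in_feat) matrix into the tiled layout
# (batch_size // BATCH, in_feat // BLOCK_IN, BATCH, BLOCK_IN):
# split each axis into (outer, tile) and move both tile axes innermost.
packed = data.reshape(batch_size // BATCH, BATCH,
                      in_feat // BLOCK_IN, BLOCK_IN).transpose(0, 2, 1, 3)

assert packed.shape == (batch_size // BATCH, in_feat // BLOCK_IN,
                        BATCH, BLOCK_IN)

# Each innermost (BATCH, BLOCK_IN) tile is now contiguous in memory,
# e.g. packed[0, 1, 0] holds columns 16..31 of input row 0.
```

My understanding is that this makes each small tile contiguous so a single hardware GEMM instruction can consume it, which would explain why the axis order is fixed, but I’d appreciate confirmation.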

Thank you

I solved questions 1–4 by working through another tutorial.

But I still can’t find the answer to question 5.
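In case it helps others who land on this thread, here is my understanding of questions 1, 3, and 4 as a NumPy sketch. The bit width is an assumption matching the default VTA config (env.INP_WIDTH = 8), and my_clip below is only a stand-in for the tutorial’s helper:

```python
import numpy as np

INP_WIDTH = 8  # assumed value of env.INP_WIDTH in the default config


def my_clip(x, a_min, a_max):
    # Same idea as the tutorial's my_clip: saturate values so they
    # fit into the fixed-point range the accelerator expects.
    return np.clip(x, a_min, a_max)


# Example 32-bit accumulator outputs from the GEMM stage.
acc = np.array([-70000, -130, 0, 500, 70000], dtype=np.int32)

# The right shift is a fixed-point requantization: it divides by
# 2**shift to scale the wide accumulator back down.
shift = 8
res = acc >> shift

# Clipping then saturates into the signed INP_WIDTH-bit range
# [-(1 << (INP_WIDTH - 1)), (1 << (INP_WIDTH - 1)) - 1] = [-128, 127],
# so narrowing to int8 cannot overflow and wrap around.
res = my_clip(res, -(1 << (INP_WIDTH - 1)), (1 << (INP_WIDTH - 1)) - 1)
res = res.astype(np.int8)
```

So, as I now understand it, shift-then-clip is the standard narrowing step that turns wide accumulator results back into INP_WIDTH-bit inputs for the next layer; saturating via clip loses less information than letting the cast wrap.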

@woojinnn Hi, I was wondering whether you implemented nn.Upsample in the graph_pack process. I am currently trying to run a U-Net on VTA, but I ran into problems during graph_pack, and I suspect the cause is nn.Upsample or torch.cat. A full description is in another post of mine, Can Upsample be implemented on VTA in graph_pack?. I have been stuck on this for a long time. Have you encountered this problem, or can you offer any suggestions? Thank you.