Dear All,
I am looking for a set of transformation passes in TVM that fuses/folds batch_norm ops into the preceding or following convolution-like layers.
My expectation:

- before batchnorm fold: conv2d → bias_add → batch_norm
- after batchnorm fold: conv2d (possibly changed weights) → bias_add (possibly changed bias)
The mathematics behind this transformation can be understood from the image below (source of the snapshot: the article "Fusing batch normalization and convolution in runtime").
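For reference, the transformation boils down to the standard per-channel folding (writing $\mu$ for running_mean and $\sigma^2$ for running_variance): if the convolution computes $y = W \ast x + b$ and inference-time batch norm computes $\gamma \, (y - \mu) / \sqrt{\sigma^2 + \epsilon} + \beta$, the folded layer uses

$$
W' = \frac{\gamma}{\sqrt{\sigma^2 + \epsilon}}\, W, \qquad
b' = \frac{\gamma\,(b - \mu)}{\sqrt{\sigma^2 + \epsilon}} + \beta .
$$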
So far I have found the SimplifyInference pass in TVM, which is related to simplifying the batch_norm op.
However, from what I understand, this pass only separates out the constant terms of the batch_norm operation so that they can be folded; the terms that involve the data are still present in the Relay graph as two basic ops, Multiply and Add (see the sketch after the formulas below):
Add(Multiply(data, scale), shift)

where:

- scale = gamma / sqrt(running_variance + epsilon)
- shift = beta - (running_mean * gamma) / sqrt(running_variance + epsilon)
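For concreteness, here is a minimal sketch of what I observe (all shapes and parameter values below are made up for illustration):

```python
import numpy as np
import tvm
from tvm import relay

# Toy conv2d -> bias_add -> batch_norm graph; shapes/values are arbitrary.
data = relay.var("data", shape=(1, 16, 32, 32))
weight = relay.const(np.random.rand(8, 16, 3, 3).astype("float32"))
bias = relay.const(np.random.rand(8).astype("float32"))
gamma = relay.const(np.random.rand(8).astype("float32"))
beta = relay.const(np.random.rand(8).astype("float32"))
mean = relay.const(np.random.rand(8).astype("float32"))
var = relay.const(np.random.rand(8).astype("float32"))

out = relay.nn.conv2d(data, weight, kernel_size=(3, 3), channels=8, padding=(1, 1))
out = relay.nn.bias_add(out, bias)
out = relay.nn.batch_norm(out, gamma, beta, mean, var)[0]
mod = tvm.IRModule.from_expr(relay.Function([data], out))

# SimplifyInference rewrites batch_norm into explicit multiply/add, but the
# multiply and add on the data path remain as separate ops in the Relay graph.
mod = relay.transform.InferType()(mod)
mod = relay.transform.SimplifyInference()(mod)
print(mod)
```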
I have applied the FoldScaleAxis and FoldConstant passes, in that order, after SimplifyInference, but they do not give me the transformation I expect.
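Concretely, this is how I am applying them (a sketch, reusing the toy module from above; if I recall correctly, FoldScaleAxis is registered at opt_level 3, hence the PassContext):

```python
seq = tvm.transform.Sequential([
    relay.transform.SimplifyInference(),
    relay.transform.FoldScaleAxis(),
    relay.transform.FoldConstant(),
])
# FoldScaleAxis is skipped at lower optimization levels.
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)
print(mod)  # in my runs, Multiply/Add are still on the data path
```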
Can someone suggest whether TVM has another set of Relay-level transformation passes that can achieve the expected batch_norm fuse/fold transformation on the Relay graph?
Thanks!!