Is there a way for us to compare intermediate results of relay graphs across targets?
The debug executor already provides a way to get the intermediate node results of optimized/transformed graphs (and much more!), but target-specific transformations make a one-to-one correspondence between intermediate computations across targets difficult, because the optimized/transformed graphs for different targets need not be the same.
We were trying out the following:
- An analysis pass that records all intermediate nodes in a relay graph very early (e.g., immediately after generating the relay model from, say, a TensorFlow model).
- A transformation pass that marks all such intermediate nodes as outputs of a new graph.
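To make the two passes concrete, here is a minimal sketch of the idea. Note that `Node`, `collect_intermediates`, and `expose_as_outputs` are hypothetical stand-ins written for illustration, not the real `tvm.relay` API; a real implementation would use Relay's expression visitors (e.g. `relay.analysis.post_order_visit`) and rebuild a `relay.Function` whose body is a `relay.Tuple` of the recorded nodes.

```python
# Minimal stand-in for a Relay-like expression graph (hypothetical classes,
# not the real tvm.relay API) illustrating the two passes described above.

class Node:
    """A toy dataflow node: an op name plus its input nodes."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = list(inputs)

def collect_intermediates(output):
    """Analysis pass: post-order traversal recording every node exactly once."""
    seen, order = set(), []
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for inp in node.inputs:
            visit(inp)
        order.append(node)
    visit(output)
    return order

def expose_as_outputs(output):
    """Transform pass: build a new graph whose single output is a tuple
    grouping every intermediate node, so each one becomes observable."""
    return Node("tuple", *collect_intermediates(output))

# Example graph: x -> relu -> dense -> softmax
x = Node("input")
g = Node("softmax", Node("dense", Node("relu", x)))
multi_out = expose_as_outputs(g)
# multi_out now exposes all four nodes, in post order
```

The key property is that the pass runs before any target-specific lowering, so both targets agree on which intermediate tensors exist and in what order they appear in the output tuple.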
Our approach is similar to Dump the intermediate compute result, but does it automatically.
We could get a very close match (in terms of the number of output nodes) between an x86 and a non-x86 target for a couple of real models.
But I’m interested in hearing about better ways of achieving this.
Such a feature would be useful for quickly isolating issues when running whole models on new targets, as it helps us find the first (and the same) intermediate node whose outputs differ between the two targets.
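Once both targets emit the same ordered list of intermediate outputs, finding the first divergence is a simple element-wise comparison. A minimal sketch (the function name and tolerances are my own choices, not an existing TVM utility):

```python
import numpy as np

def first_divergence(outs_a, outs_b, rtol=1e-5, atol=1e-6):
    """Return the index of the first pair of intermediate outputs that
    differ beyond the given tolerances, or None if all pairs match."""
    for i, (a, b) in enumerate(zip(outs_a, outs_b)):
        if not np.allclose(a, b, rtol=rtol, atol=atol):
            return i
    return None

# Example: outputs agree at node 0, diverge at node 1
target_a = [np.ones(3), np.zeros(3)]
target_b = [np.ones(3), np.array([0.0, 0.0, 1.0])]
idx = first_divergence(target_a, target_b)
```

The tolerances matter in practice: different targets can legitimately differ in the low-order bits (fused multiply-adds, reduction order), so the thresholds should be loose enough to ignore such noise while still catching real bugs.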
This would also be useful when existing unit tests for op implementations do not cover the particular problematic tensor values encountered during a model run with real inputs.
Thank you very much in advance!