Yes, exactly! I feel your pain too.
Some frameworks use the full int8 range, e.g. [-128, 127], while others use a restricted symmetric range, [-127, 127].
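For concreteness, here is a minimal sketch of symmetric int8 quantization under the two range conventions (the function name and signature are mine, not from any particular framework):

```python
import numpy as np

def quantize_symmetric(x, restricted=False):
    """Symmetric int8 quantization.

    restricted=True clamps to [-127, 127]; otherwise the full
    [-128, 127] int8 range is used.
    """
    qmax = 127
    qmin = -127 if restricted else -128
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), qmin, qmax).astype(np.int8)
    return q, scale
```

The restricted range gives up one code point but keeps the range symmetric around zero, which some kernels rely on to sidestep int8 overflow corner cases.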
Even at the operator level, per-channel or per-layer quantization is sometimes not sufficient. A pointwise convolution should not treat channels the same way a generic conv does; likewise for a grouped conv, and it gets even more fun with fused or split operators.
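To illustrate the per-layer vs per-channel distinction, a rough sketch (function names are mine):

```python
import numpy as np

def per_tensor_scale(w):
    """One scale for the whole tensor (per-layer)."""
    return np.abs(w).max() / 127.0

def per_channel_scales(w, channel_axis=0):
    """One scale per channel along `channel_axis`.

    For a pointwise (1x1) or grouped conv, the right choice of axis
    or grouping differs from a generic conv, which is exactly why a
    one-size-fits-all rule breaks down.
    """
    reduce_axes = tuple(i for i in range(w.ndim) if i != channel_axis)
    return np.abs(w).max(axis=reduce_axes) / 127.0
```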
So the framework should be flexible enough to plug in such dedicated quantizers when needed, maybe through some kind of pattern matching.
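A toy sketch of what such pattern-based dispatch could look like; everything here (the registry, the op representation, the scheme names) is hypothetical and only meant to convey the idea:

```python
# Hypothetical pattern -> quantizer registry; not any existing API.
QUANTIZERS = []

def register(predicate):
    def wrap(fn):
        QUANTIZERS.append((predicate, fn))
        return fn
    return wrap

@register(lambda op: op["kind"] == "conv" and op.get("kernel") == (1, 1))
def quantize_pointwise(op):
    return "pointwise-specific scheme"

@register(lambda op: op["kind"] == "conv" and op.get("groups", 1) > 1)
def quantize_grouped(op):
    return "per-group scheme"

@register(lambda op: True)  # fallback for anything else
def quantize_default(op):
    return "per-layer scheme"

def pick_quantizer(op):
    # First matching pattern wins; registration order encodes specificity.
    for predicate, fn in QUANTIZERS:
        if predicate(op):
            return fn
```

The point is not this particular mechanism, just that dedicated quantizers can be matched against operator patterns instead of being hardcoded per operator type.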
I like the video codec analogy: there are many ways to encode a video, but there must be one well-defined way for any player to play it back.
In that sense, I see the qnn ops as the decoder part, and this framework should be flexible enough to allow various encoding schemes over time. For validation, we could start by reproducing, say, the TFLite scheme, but we should not be limited by it (TFLite is very, very limited).