Thanks for the discussion! For the sake of being thorough (though this is a very minor change), I’m concluding this thread with the final design decisions.
I have implemented this as an analysis ModulePass that collects the fused primitive functions and returns them as the functions of a result IRModule.
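To illustrate the idea for anyone finding this thread later, here is a minimal toy sketch in plain Python. It is not TVM's API: the `Call` and `Function` classes, the `primitive` flag, and the `extract_fused_functions` helper are all hypothetical stand-ins for the IR nodes and the pass. It only shows the general shape of walking every function in a module and gathering the ones marked as primitive.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """Toy call node: `op` is either a Function or an operator name (str)."""
    op: object
    args: list = field(default_factory=list)


@dataclass
class Function:
    """Toy function node; `primitive` stands in for the fused/primitive marker."""
    name: str
    body: object
    primitive: bool = False


def extract_fused_functions(module):
    """Collect every primitive function reachable from the module's functions.

    `module` is a plain dict mapping names to Function objects (a stand-in
    for a module). The result is a new dict holding only primitive functions.
    """
    result = {}

    def visit(node):
        if isinstance(node, Function):
            if node.primitive:
                result[node.name] = node
            visit(node.body)
        elif isinstance(node, Call):
            visit(node.op)
            for arg in node.args:
                visit(arg)
        # Anything else (variable or operator names) is a leaf.

    for func in module.values():
        visit(func)
    return result


# Hypothetical module: `main` calls two fused (primitive) functions.
fused_add_relu = Function(
    "fused_add_relu", Call("relu", [Call("add", ["x", "y"])]), primitive=True
)
fused_conv = Function("fused_conv2d", Call("conv2d", ["x", "w"]), primitive=True)
main = Function(
    "main", Call(fused_add_relu, [Call(fused_conv, ["data", "weight"]), "bias"])
)

extracted = extract_fused_functions({"main": main})
```

In this toy version the input module is left untouched and only a new collection is returned, which is the sense in which the pass is an analysis rather than a transformation.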
Please add further suggestions to the PR itself.