On second thought, I realize that certain ops, such as arange, require dynamic attributes. Shape inference only has access to an op's attributes, not to the values of its inputs. Therefore, when the output shape depends on dynamic values rather than on input shapes, as with start/stop/step in arange, we have to put these dynamic values in the attributes.
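
To make this concrete, here is a rough sketch of what attribute-only shape inference for arange could look like. The function name `infer_arange_shape` and the plain `attrs` dict are hypothetical stand-ins for illustration, not the framework's actual API. It assumes the usual arange length rule, `ceil((stop - start) / step)`, clamped at zero.

```python
import math


def infer_arange_shape(attrs):
    """Hypothetical shape-inference function for arange.

    It sees only `attrs` (no tensor values), so start/stop/step
    must be carried as attributes for the shape to be computable.
    """
    start = attrs.get("start", 0)
    stop = attrs["stop"]
    step = attrs.get("step", 1)
    if step == 0:
        raise ValueError("arange step must be non-zero")
    # Number of elements in [start, stop) with the given stride;
    # an empty range yields length 0 rather than a negative size.
    length = max(0, math.ceil((stop - start) / step))
    return (length,)


shape = infer_arange_shape({"start": 0, "stop": 10, "step": 3})
```

If start/stop/step were passed as input tensors instead, this function would have nothing to compute the length from, which is the reason for moving them into the attributes.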