All of the alternatives (A1a, A1b, A1c) should be able to cover the need that we initially brought up around N3. Additionally, the Target system as it is now is already powerful enough to resolve the N3-related needs that were brought up, as shown by the alternatives @junrushao listed along the A1c direction.
In all cases, it is certainly possible to resolve the problems with extra layers of abstraction and indirection. As a matter of fact, they are all very similar, except for how the data structure itself is built up.
So the main thing that would be helpful is to understand the tradeoffs under different contexts. Given that our previous discussions focused on N3, it is also helpful to look at things from the perspective of other needs.
To give some examples:
From N0’s pov, the ability to directly pass in a Target with a host field is a good default solution for this most common combo, so in the case of API/UX design, we might want to encourage this kind of usage without worrying about additional fields for heterogeneous setups in a config.
build(mod, Target("cuda", host="llvm"))
Additionally, the transition from E0 to E1 encourages a transition from a Target with a host field (which indicates a mixed host program) to a device-only Target (without a host).
From N2’s perspective, aws/c5 favors treating the deployment target as a holistic thing (aka at the config level).
build(mod, "aws/c5")
Under the context of config and target, we will need to be able to say that a tag can refer to either a config or a Target, which effectively complicates the tagging system and its explanation. Additionally, there will be a need for a common mechanism to register tags for both target and config. Making them more uniform would make this perspective more streamlined.
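To make the tagging point concrete, here is a minimal sketch of what a single registry could look like if both a single-device target and a multi-target spec were treated as the same kind of thing. Everything here is hypothetical and for illustration only: `Target`, `MultiTarget`, `register_tag`, `resolve`, and the `executor` field are stand-in names, not the actual TVM API, and the values registered under the tags are made up.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a single-device target, optionally with a host."""
    kind: str
    host: Optional[str] = None


@dataclass(frozen=True)
class MultiTarget:
    """Hypothetical composite: several targets plus runtime/executor settings."""
    targets: Tuple[Target, ...]
    executor: str = "graph"  # assumed field name and default, for illustration


TargetLike = Union[Target, MultiTarget]

# One registry for both kinds of spec, so a tag never has to say which one it is.
_TAG_REGISTRY: Dict[str, TargetLike] = {}


def register_tag(name: str, spec: TargetLike) -> TargetLike:
    """Register a tag; the same call works for Target and MultiTarget."""
    if name in _TAG_REGISTRY:
        raise ValueError(f"tag {name!r} is already registered")
    _TAG_REGISTRY[name] = spec
    return spec


def resolve(spec: Union[str, TargetLike]) -> TargetLike:
    """Accept either a tag string or an already constructed spec."""
    if isinstance(spec, str):
        return _TAG_REGISTRY[spec]
    return spec


# Illustrative registrations only; the real contents of these tags are not specified here.
register_tag("nvidia/cuda12", Target("cuda", host="llvm"))
register_tag("aws/c5", MultiTarget(targets=(Target("llvm"),), executor="graph"))
```

With a single registry like this, a `build(mod, "aws/c5")` style entry point would only need to call `resolve` once, regardless of whether the tag expands to a single target or a composite.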
From N4’s pov, we will need to be able to represent the objects during decomposition, which means there will be a need for smooth transitions of related information at the function level. For example, a function that involves a mixed host/device target may transition to a device-only function. If that entails a difference in how the “target constraints” are represented, e.g. functions with multi-target start with a “config” attr, while functions with a single device get a “target” attr, then such a transition is not as uniform.
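As a rough sketch of the uniform alternative, assume function attributes are a plain dict and the constraint always lives under one key, here called "target". Splitting a mixed host/device function then only changes the value under that key, not the key itself. The `Target` class and `split_host_device` helper are hypothetical names for illustration, not an existing pass.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Target:
    """Hypothetical target; `host` is set only for mixed host/device functions."""
    kind: str
    host: Optional["Target"] = None


def split_host_device(func_attrs: Dict[str, object]) -> Tuple[dict, dict]:
    """Split attrs of a mixed function into (host_attrs, device_attrs).

    Both results keep the same "target" key, so later passes can read the
    constraint the same way before and after the split.
    """
    target = func_attrs["target"]
    if not isinstance(target, Target) or target.host is None:
        raise ValueError("expected a Target with a host field")
    host_attrs = {**func_attrs, "target": target.host}
    device_attrs = {**func_attrs, "target": replace(target, host=None)}
    return host_attrs, device_attrs


mixed = {"global_symbol": "main", "target": Target("cuda", host=Target("llvm"))}
host_attrs, device_attrs = split_host_device(mixed)
```

The point of the sketch is only that the attribute name stays the same across the transition; how the real decomposition is implemented is outside its scope.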
In the context of N5, there will be a need to be able to log both a single-device target and a multi-target config as part of the autotuning logs in the same way. From the automation’s pov they are all “target constraints” of a function, or of a collection of functions. As in N4, this would favor a single entity that captures the “target constraint” in a uniform way, or at least a unified serialization mechanism and perhaps repr printing that covers the targets involved.
Finally, we need to consider the overall UX perspective on how to articulate this to the user. On one hand, we can certainly introduce a lot of concepts to users in their most complete form. But the best APIs (keras is a great example) always aim to present their simplest form for the most important use cases.
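One way to picture a unified serialization mechanism is a single pair of functions that round-trips either kind of constraint through JSON, with a schema name recorded alongside the fields. This is only a sketch under assumptions: the `Target` and `MultiTarget` classes, the `executor` field, and the `"schema"` key are all hypothetical and do not reflect the existing log format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Target:
    """Hypothetical single-device target, optionally with a host."""
    kind: str
    host: Optional[str] = None


@dataclass(frozen=True)
class MultiTarget:
    """Hypothetical multi-target constraint with an executor setting."""
    targets: Tuple[Target, ...]
    executor: str = "graph"  # assumed field, for illustration


Constraint = Union[Target, MultiTarget]


def serialize(constraint: Constraint) -> str:
    """Serialize any target constraint to one JSON line for a tuning log."""
    payload = {"schema": type(constraint).__name__, **asdict(constraint)}
    return json.dumps(payload, sort_keys=True)


def deserialize(text: str) -> Constraint:
    """Rebuild the constraint from a line produced by `serialize`."""
    payload = json.loads(text)
    schema = payload.pop("schema")
    if schema == "Target":
        return Target(**payload)
    if schema == "MultiTarget":
        targets = tuple(Target(**t) for t in payload["targets"])
        return MultiTarget(targets=targets, executor=payload["executor"])
    raise ValueError(f"unknown schema {schema!r}")


single = Target("cuda", host="llvm")
multi = MultiTarget(targets=(Target("llvm"), Target("cuda")))
log_lines = [serialize(single), serialize(multi)]
```

A tuning log written this way never needs to know in advance whether a record describes one device or several; the reader dispatches on the recorded schema.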
Then we would get to a point where a user would ask “what is the difference between a config of a function that can run on multiple devices and a target of a function that only runs on one device?” While we can certainly come up with an answer, from a UX point of view the tags aws/c4 (which can indicate a config that involves a runtime env) and nvidia/cuda12 (which indicates a single target) are so similar that a user might feel there is an artificial boundary here.
Importantly, the majority of users do not have to deal with a MultiTarget setting. It is also unlikely that they need to deal with explicitly setting the executor or runtime if we have a proper tag or a good default. So our most common use case is the setting that contains a TargetWithHost. We want to maximize the ease of use in this setting. Only asking the user to learn about a target that comes with a host field, plus the ability to tag, is the simplest way to tell the story, without introducing the extra concept of Config.
So the UX story is like a journey:
- step0, useful for the most common use cases: “you can use a target to specify the deployment environment constraint that you have on a single device, and you have the ability to tag the specification”.
- step1, generalizing the same story for heterogeneous use cases: “you can specify a MultiTarget, which is also a target with a specific schema to specify the heterogeneous execution case, and fine-tune the runtime and executor settings. BTW, you get the same ability to tag and log them in the same way as step0”.
And if a user does not want to bother hearing about the steps, there is a simpler story: "just pick a tag that closely matches the platform of your interest, for example aws/g4:gpu".