In collaboration with @areusch
A target object in TVM provides all the information needed for code lowering and code generation.
Currently, a target is specified by a string in the format of `<target-id> [attributes]`, where the `<target-id>` is the name of the final code generator (llvm, cuda, opencl).
Many new challenges arise as we introduce advanced features such as cross-compilation, heterogeneous targets, and customized code generators to the compiler stack:
- C0: Represent the host compiler information: when we cross-compile to WebGPU, we also need to specify `wasm` as the target for the host code.
- C1: Represent composite targets in heterogeneous compilation settings.
- C2: Customize compilation passes and code generators based on the target.
- C3: Provide simple canonical names (e.g. `nvidia/tx2`, `broadcom/rpi4b`) to refer to the target of interest.
- C4: Pass in additional target-specific attributes, such as ISA/library extensions, to a target.
This RFC proposes a strawman to address these problems. See also our previous discussion on this topic: Target and Attributes.
Scope of a Target
While a Target implies things about the runtime environment of generated code, the Target configuration is only intended to configure the TVM compiler. Options that are specific to the runtime environment should configure the DLContext rather than the Target. As a simple way to decide whether a configuration option belongs in the Target config, ask the question: "Would this option change the contents of the generated Module?"
Some examples of things that belong in the target string:
- ISA and ISA extensions (i.e. FPU presence) for the target platform
- Availability of C libraries
Some examples of things that purely change the runtime execution environment:
- Linker options, for CPU-targeted code:
- Link-time optimization
- Code location (i.e. FLASH vs RAM)
- Options that influence code loading
- i.e. for micro targets: configuration for the flash tool
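Applying this rule to a hypothetical embedded ARM board: the ISA extension changes the generated code, so it belongs in the target configuration, while flash-tool settings only affect code loading and stay outside. The field values below are illustrative, not a prescribed configuration:

```json
{
  "id": "llvm",
  "mtriple": "armv7e-m-none-eabi",
  "mattr": "+fp16"
}
```

A flash base address or programmer baud rate, by contrast, would leave the generated Module unchanged and therefore belongs in the runtime/loader configuration rather than in this object.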
Strawman Proposal
We propose to use JSON to encode a target. The following example shows a target that represents the GPU on an NVIDIA TX2 board.
{
"id": "cuda",
"tag": "nvidia/tx2-cudnn",
"keys": ["cuda", "gpu"],
"libs": ["cudnn"],
"target_host": {
"id": "llvm",
"system_lib": true,
"mtriple": "aarch64-linux-gnu",
"mattr": "+neon"
}
}
The top-level fields of this target include:
- `id` (string): specifies the kind of the target.
- `tag`, optional (string): a special attribute that specifies an alias of the target (think of tags in docker images).
  - A tag can be used to uniquely reconstruct the entire target.
  - tag is also used as a key in the autotvm logs; to make sure a tag is immutable, we can also hash the content of the target and record that in the log.
- `keys` (`Array<String>`): list of keys that can be used to derive specific autotvm strategies.
  - keys provide more coarse-grained information about the target than the tag.
  - An alternative is to simply unify the keys with the tag, by allowing special tags like `cuda` that do not correspond to a concrete target.
- `attrs` (`Map<String, ObjectRef>`): other optional attributes.
  - `target_host` (Target): the host execution target, if different from the current host.
  - `libs` (`Array<String>`): list of additional extensions that can affect lowering.
On the C++ side, we can store the special attributes as typed fields and the additional attributes in a Map.
class TargetNode {
public:
// id
TargetId id;
// special attributes
String tag;
Array<String> keys;
// other attributes
Map<String, ObjectRef> attrs;
};
TargetId Registry
To support C2, we introduce a registry for per-target-id global attributes. The following code block shows how to register target-specific passes and attribute options for the cuda target.
TVM_REGISTER_TARGET_ID("cuda")
.set_attr<TPreCodegenPass>("TPreCodegenPass", [](Target target) -> Array<Pass> {
//...
})
// add target host as an additional option.
.add_attr_option<Target>("target_host");
Target Tag
The tag is a short alias for the target. We will maintain a collection of tags and the corresponding target configurations in the project. Users can directly specify their target via a tag. We can create multiple aliases for each tag. However, there should always be a canonical tag that all aliases map to. The canonical tag is used in the tag field of the target.
Here is an example list of tags. Typical naming choices for tags include `<vendor>/<soc-name>[-device]` for an SoC and `<cloud-provider>/<instance-type>[-device]` for a cloud instance.
- nvidia/gtx2080ti
- apple/iphone8-cpu
- aws/c4.xlarge
- rockchip/rk3399-gpu
It is also useful to create a hierarchy among tags. For example, `rockchip/rk3399-gpu` is a special case of a Mali GPU, and its performance information can be useful for other SoCs with the same type of GPU. This information can be captured by keys (feel free to suggest alternative names).
Depending on the need, we could also optionally attach a version suffix at the end, e.g. `apple/iphone8-cpu:v1.0`; this convention might be useful for upgrading the target spec of a given tag.
Schema Validation
The new target specification is quite flexible, as we can introduce new attributes to the target. However, additional flexibility also increases the chance of misconfigurations and typos. We propose to validate the keys of the attributes, as well as the types of the corresponding values, when constructing from a JSON object, using information registered to the target registry. The following example shows how to register attribute options for the LLVM (llvm CPU) target.
TVM_REGISTER_TARGET_ID("llvm")
.add_attr_option<Bool>("system_lib")
.add_attr_option<String>("mtriple")
.add_attr_option<String>("mattr");
Composite Target
Under the current proposal, we can introduce a special target id to represent a composite target. Lowering a composite target invokes partitioning passes that split the function into several functions, each corresponding to a primitive target in the composition. The compiler then calls into the custom lowering pass of each specific target and links everything together.
{
"id": "composite",
"targets": [ { "id": "llvm" }, { "id": "cuda" } ]
}
Notably, we can also introduce multiple (named) composite targets if we need to customize the lowering strategy.
Bring your own codegen
Under the new proposal, BYOC can be supported naturally via the target-specific lowering infrastructure. A customized backend can register its own target id and a code generator attribute.
TVM_REGISTER_TARGET_ID("dnnl")
.set_attr<TRelayCodegen>("TRelayCodegen", DNNLRelayCodegen);
Discussions
The hierarchical nature of composite targets brings many new challenges. For example, we can no longer use a plain string option format for general targets. It would be great to have more discussion, possibly around, but not restricted to, the following topics:
- Naming alternatives:
  - id: (`target_key`, `target_type`, `name`, `kind`)
  - tag: (`hardware`, `device`)
  - keys: (`category`, `groups`)
- TargetNode C++ design choices:
- N0: keep typed special attribute fields
- N1: fold most fields into attrs
Target host convention
- T0: keep target_host as an optional field in the device target. Rationale: the host driving code is a special part of the device driver code. It is easier to pick up the keys in the top-level target when we run autotvm, and we need to preserve the target host info until the very end.
{ "id": "cuda", "target_host": { "id": "llvm" } }
- T1: treat target host configuration as a composite-style target configuration. Rationale: it is not that different from composite.
{ "id": "composite",
  "target_host": { "id": "llvm" },
  "targets": [ { "id": "opencl" } ]
}