Thanks @ziheng for bringing up this topic! Automatic object generation is an important piece of functionality that could significantly lower the burden on developers, especially those working across languages.
I think we all agree that the functionality itself is desirable, and also on how we design the schema fields and customized methods.
It looks like our disagreement comes down to the choice of language, i.e. Python vs JSON/YAML. It certainly makes sense to discuss this further, because the choice will have a profound impact on all future development.
I would love to compare Python and JSON along the following dimensions.
D1. Parsing. As a general-purpose language, Python is certainly harder to parse than JSON, and as @tkonolige said, there is less library support for it. If some day we want to implement a Python parser in C++/Rust, it would be much more trouble than a JSON parser. However, I would argue that
- Parsing is not part of TVM’s compilation pipeline - it is only the object schema’s compilation pipeline.
- Technically, as @tqchen said, we neither really need a parser for Python nor need to touch the Python AST - we can inspect the Python class instead, using Python’s standard APIs, which are more stable than the Python AST itself.
Therefore, with Python we don’t really need to parse or touch the AST, while with JSON we do need a parser. IMO, Python is not that bad at this particular task, especially since we do not really need to do any parsing. I would score 9/10 for Python, and 10/10 for JSON.
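To make the "inspect the class instead of parsing" point concrete, here is a minimal sketch using only the standard library (typing.get_type_hints and inspect.getdoc). The classes and the describe helper are hypothetical stand-ins I made up for illustration, not actual TVM code, and a real generator would likely need more than this.

```python
import inspect
from typing import get_type_hints


class PrimExpr:
    """Hypothetical stand-in for a TVM base expression class."""


class Add(PrimExpr):
    """Addition of two expressions."""

    lhs: PrimExpr
    rhs: PrimExpr


def describe(cls):
    """Collect schema information from a live class, without parsing source."""
    return {
        "name": cls.__name__,
        # The direct parent is the second entry of the method resolution order.
        "base": cls.__mro__[1].__name__,
        "doc": inspect.getdoc(cls),
        # get_type_hints resolves the annotations into real type objects.
        "fields": {name: tp.__name__ for name, tp in get_type_hints(cls).items()},
    }


schema = describe(Add)
print(schema)
```

No tokenizer, grammar or AST is involved here: the Python interpreter has already done that work by the time the class object exists.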
D2. Concision. JSON is supposed to be (somewhat) human-readable. However, that is not always the case in practice. In our particular case, we can define an object in Python with each of its fields listed on a single line, like:
class Add(PrimExpr):  # <= inheritance
    """ docs """
    lhs: PrimExpr  # <= name and type of the field
    rhs: PrimExpr
    # you can insert whatever comments too
However, JSON lacks concise support for those structures. It gets long and tedious quite easily, especially when you want to bring in customization.
Therefore, because JSON does not provide the features that a programming language usually has (e.g. succinct type annotations, comments), I personally prefer Python to JSON for concision, with Python 10/10 and JSON 5/10.
D3. Recursive containers. Containers are used quite frequently in TVM. For example, we do have to handle types like Array<PrimExpr> and Map<Instruction, Array<ObjectRef>>. With Python syntax, we can write each of them in a single line with Python’s builtin type annotations:
array: List[PrimExpr]
map: Dict[Instruction, List[ObjectRef]]
However, as we have discussed in D2, JSON does not provide approachable syntax to allow customization in a human-readable way. Just imagine what it will look like in JSON.
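As a rough illustration of both sides, the sketch below walks nested annotations with typing.get_origin and typing.get_args and renders them as C++-style type names, then renders the same map type as a nested dict. Everything here is an assumption on my part: the stand-in classes, the Array/Map mapping, and especially the JSON layout, which is just one encoding I could imagine and not a proposal anyone has made.

```python
import json
from typing import Dict, List, get_args, get_origin, get_type_hints


class ObjectRef:
    """Hypothetical stand-in for TVM's ObjectRef."""


class PrimExpr(ObjectRef):
    """Hypothetical stand-in for TVM's PrimExpr."""


class Instruction(ObjectRef):
    """Hypothetical stand-in for an Instruction object."""


class Example(ObjectRef):
    """Hypothetical object with nested container fields."""

    array: List[PrimExpr]
    map: Dict[Instruction, List[ObjectRef]]


# Assumed mapping from Python container types to TVM container names.
CONTAINERS = {list: "Array", dict: "Map"}


def to_cpp(tp):
    """Recursively render a type annotation as a C++-style type string."""
    origin = get_origin(tp)
    if origin is None:  # a plain class, not a parameterized container
        return tp.__name__
    args = ", ".join(to_cpp(arg) for arg in get_args(tp))
    return f"{CONTAINERS[origin]}<{args}>"


def to_json_like(tp):
    """Render the same type as a nested dict (one imaginable JSON encoding)."""
    origin = get_origin(tp)
    if origin is None:
        return {"type": tp.__name__}
    return {
        "type": CONTAINERS[origin],
        "args": [to_json_like(arg) for arg in get_args(tp)],
    }


hints = get_type_hints(Example)
cpp_types = {name: to_cpp(tp) for name, tp in hints.items()}
json_map = to_json_like(hints["map"])

print(cpp_types)
print(json.dumps(json_map, indent=2))
```

The Python field takes one line, while the pretty-printed dict for the same map type spans more than a dozen lines. Other JSON layouts are certainly possible, so take this as a feel for the problem rather than a measurement.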
IMHO, for open source contributors who have limited time to learn the core infra, the tooling must be really easy to use. We would prefer that the core developers spend a bit longer (although in practice it may not even take longer) making the infra cool and robust, rather than asking OSS contributors to spend their valuable time figuring things out. In this particular case, I really want to advocate for Python over JSON. It is a 10/10 vs 3/10.