To explore the root cause, I read its source code and find that TVM seems to assume all arguments of operators are valid in src/tir/op/op.cc. If we pass None as a parameter, it will be wrapped as an object whose data_ is equal to nullptr. And when we execute x.dtype(), according to its code, it will try to access data_->dtype, an illegal memory address since data_ points to nullptr.
// get() defined in ObjectRef
T* get() const { return static_cast<T*>(data_); }
// dtype() defined in PrimExpr
DataType dtype() const { return static_cast<const PrimExprNode*>(get())->dtype; }
To enhance the robustness and remind developers of their incorrect use, I think it’s necessary to raise an exception here instead of crashing the program. Is my idea correct?And if yes, how can we fix this unexpected behavior? I’d appreciate any help
Good question! I think the best solution is don’t allow nullptr for TVM Object except Optional object. Let me take For stmt constructor as an example.
Thanks for replying! It’s very exciting to know that TVM is trying to fix these legacy problems. And I provide some cases which will trigger crash when receiving NULL object in a GitHub issue. I hope this can help TVM to solve these problems!
In addition, some C++ APIs also need to be changed to tvm::runtime::Optional. For example, tvm.tir.Var(..., ..., span = None) takes None as default parameter for span, but in the C++ codebase, the span attributed is not marked as Optional<Span>.
I guess there are 2 approaches to resolve such issues:
use Optional for those that can be nullptr and do checking in TVMPODValue_::AsObjectRef().
if (type_code_ == kTVMNullptr) {
CHECK(TObjectRef::_type_is_nullable)
<< "Expect a not null value of " << ContainerType::_type_key;
+ ICHECK(ArgNullable<TObjectRef>::value) << "Passing a null value to " << ContainerType::_type_key << " is not allowed";
return TObjectRef(ObjectPtr<Object>(nullptr));
}
pros: “ultimate”
cons: need break existing APIs. For those who take Span as parameters, we need to convert it into Optional<Span>. And the use case will change to span.value()... making massive code changes.
compatible change: check if the parameters are nullptr in set_body statements.
pros: easier and less change.
cons: not fundamental and such checking statements scales when new APIs coming in.