[DISCUSS] Runtime Array Containers Array/ADT/String

tqchen · November 1, 2019, 9:17pm

Currently, we have a set of containers that can be used in the compiler.
As we started to build more powerful runtimes, we might want to
introduce certain container types to our runtime.

This is an RFC to discuss potential new container designs for the runtime.
This RFC focuses on the array-type container as they are simpler, more
lightweight and has a demand in the runtime.

Array: An Example

We need to consider the following factors when designing a
runtime container type.

F1: Have a POD-C compatible signature for cross language purposes.
F2: Being able to move c++ container into it.
F3: Reduce indirect memory allocation when possible.

We discuss the design for array as an example to demonstrate
how to get to these three points in the code below.

// Base array object
class ArrayObj : public Object {
 private:
  // Use 32bit size?
  uint32_t size_;
  // the capacity for allocation.
  uint32_t capacity_;
  // The elements in the array.
  ObjectRef* data_;

  // helper class to move from std::vector
  class FromStd;
};

// Move from std
class ArrayObj::FromStd : public ArrayObj {
 private:
   std::vector<ObjectRef> data_;

   explicit FromStd(std::vector<ObjectRef> data)
      : data_(data) {
      data_ = data_.data();
      CHECK_LE(data_.capacity(), UINT_MAX);
      size_ = static_cast<uint32>(data_.size());
      capacity_ = static_cast<uint32>(data_.capacity());
   }
   friend ObjectPtr<ArrayObj> ArrayFromVector;
};

// make string by move a string without copying the content.
ObjectPtr<ArrayObj> ArrayFromVector(std::vector<ObjectRef> data) {
  return make_object<ArrayObj::FromStd>(data);
}

// Special function to make array of size in a single block
ObjectPtr<ArrayObj> ArrayAlloc(size_t n, size_t cap) {
  // simplified code, need additional code to handle alignment
  void* ptr = malloc(sizeof(ArrayObj) + cap * sizeof(ObjectRef));
  ArrayObj* header = static_cast<ArrayObj*>(ptr);
  // initialize the header.
  header->data_ = reinterpret_cast<ObjectRef*>(
     static_cast<char*>(ptr) + sizeof(ArrayObj));
  header->size_ = size;
  header->capacity_ = cap;
  // initialize other fields, such as destructor
  //
  return make_object<ArrayObj::FromStd>(data);
}

Discussions about the code.

vector is not exposed in ArrayObj(F1), instead we directly use POD data.
data field is necessary to implement F2 (see FromStd).
This field allows us to move in from std::vector container without copying the content.
When creating an array by ourselves, we can directly allocate the
data chunk right after the ArrayObj header (ArrayAlloc).
However, we will need a special allocator support to do so.
For our use cases in IR, it is unlikely that an array will
go beyond 32bit int, so we could use 32 bit int to store the data,
altough it might be overkill to save 8 bytes.
The total overhead of array(without additional elements) is 4 * 8 bytes (if we use uint32 for size).

We can make several possible design variations.

If we do not want the ability to move from an existing container, we can remove the data field.
If we do not need frequent resizing of the array(e.g. push_back),
then the capacity field is not necessary (which is useful for Tuple and ADT).
Wehether it is worth to save 8 bytes by making the size and capacity int32,
or should we go with int64_t.

Proposal for other Containers

In this section, we propose two other containers: ADT and String.

// Base ADT object
class ADTObj : public Object {
 private:
  // tag for ADT
  uint32_t tag_;
  // the capacity for allocation.
  uint32_t size_;
  // The elements in the tuple
  // ObjectRef* data_;
};

In the case of ADT, we do not need to resize.
As a result, we can simply remove the capacity field and replace it by a tag field.
It may or makes sense to introduce the data field.
Given that we mostly want to allocate the content, we might as well remove the data field.
Note that means we lose the ability to move from an existing container.

// Base String object
class StringObj : public Object {
 private:
  // the capacity for allocation.
  uint64_t size_;
  // The elements in the array.
  char* data_;
};

String is another interesting case.
In this case the ability to move from std::string is useful(which means we need data_).
Size may need to be 64bit in case we use it to pass a big binary blob.
It is less frequent to resize an string versus an array, so we may not need the capacity field.
String can be used to serve as a container for passing return std::string in TVMRetValue.

Why the three containers

ADT is already used in vm so that it is more of a proposal to upgrade it to be more cross platform.
Array is currently being used in compiler but not in runtime, via support using std::vector, it might be interesting to see if we would like to bring it to runtime. Regardless of whether it should be in runtime, we might want to update to keep the signature in POD-C compatible style so that it can be cheaply accessed in another language
String is an interesting one. If we want to make the IR accessible in a full POD-C compatible fashion, then we need a string container that is not std::string, it might also benefit how we return strings from a PackedFunc.

Why POD-C compatible types

The main advantage is the cross language. Imagine we are building a pure C code generator(or llvm) that interacts with these container objects. If the array is hidden behind a std::vector, then we will have to call into a runtime function ADTGetElem in order to get the corresponding element. We can directly generate an accessor code to get the corresponding element by addressing.

Summary

In this RFC, we discussed design variants for runtime array containers and their trade-offs.

It would be great if we can have a discussion about:

Whether we want to introduce these container objects
(in particular String and Array) into runtime.
What choice do you like for each container type.

junrushao · November 1, 2019, 9:38pm

Having POD-C style signature does address many across ABI issues, but I am not confident that it addresses all of them.

In this very case, I am not sure about the correctness of mismatching malloc/dealloc across DLL boundary, i.e. ArrayAlloc. Do we have any guarantee that the memory allocated in one DLL must be freed in the same DLL?

tqchen · November 1, 2019, 9:43pm

Because all the objects have their own deleter (as in the Object protocol) the correct deleter will be called

junrushao · November 1, 2019, 9:57pm

If we do not do any memory allocation besides constructing the object, then the correct deleter does make such guarantee.

I was wondering, however, how about it when supporting push_back when the capacity is full? A potential solution would be call deleter of the original array, and create a new one to replace it inplace.

The complication here is whether we implement copy-on-write or always-copy semantics.

tqchen · November 1, 2019, 10:00pm

push_back/resize, we can use your proposal: create a new one, copy the content over, and call the deleter of the original one.

We can implement copy-on-write semantics. So if there are two refs to the same array, the other ref won’t be affected when we do modification.

junrushao · November 1, 2019, 10:05pm

Yup, so here comes one issue that I am not sure I could address: The deleter would delete the struct itself.

tqchen · November 1, 2019, 10:07pm

If the semantics is copy-on-write, then we are fine. Instead of calling the deleter to delete, we will simply decrease the ref counter. And the content won’t be deleted until the other reference is out of scope.

junrushao · November 1, 2019, 10:08pm

Gotcha. That makes a lot of sense. Thanks!

wweic · November 1, 2019, 10:26pm

Can you elaborate on the motivation of these containers? Do you want to unify the container used in compiler and runtime?

tqchen · November 1, 2019, 11:20pm

ADT is already used in vm so that it is more of a proposal to upgrade it to be more cross platform.
Array is currently being used in compiler but not in runtime, via support using std::vector, it might be interesting to see if we would like to bring it to runtime. Regardless of whether it should be in runtime, we might want to update to keep the signature in POD-C compatible style so that it can be cheaply accessed in another language
String is an interesting one. If we want to make the IR accessible in a full POD-C compatible fashion, then we need a string container that is not std::string, it might also benefit how we return strings from a PackedFunc.

Thanks @wweic for the comment, i also added the motivation to the beginning. Of course we do not need to bring all of them, and we can suggest which ones might be useful. Given that we do have a minimum runtime requirement.

wweic · November 2, 2019, 3:03am

Thanks @tqchen! Sounds interesting, this way, almost all the objects in c++ will be easily accessible in other languages.

junrushao · November 5, 2019, 12:58am

I tried to go through the pseudo code in the section “Array: An Example”. I think I could understand the basic idea, but probably missed something as well.

What I can understand:

ArrayObj and ArrayObj::FromStd. ArrayObj is the visible interface that contains only three POD fields. And its underlying implementation, ArrayObj::FromStd, which inherits from ArrayObj, is hidden inside somewhere.
ArrayFromVector. This is the creator of an ArrayObj, which essentially creates an instance of its child class ArrayObj::FromStd.
ArrayAlloc. This creates a contiguous buffer of memory, which has a ArrayObj header, as well as space for its elements indicated by cap. They are returned as a single ObjectPtr<StringObj>.

What I do not understand:

Is ArrayObj::FromStd::MakeString a friend method? Could you help elaborate its signature?
ArrayObj::FromStd::FromStr does do the copy of std::vector data, right? Otherwise, could we allow only move constructor from std::vector?
In ArrayFromVector, I think the return signature should be ObjectPtr<ArrayObj>. Otherwise, is there any implicit casting happening around? This question also applies to the return value of ArrayAlloc.

Memory layout of ArrayObj::FromStd. I have little knowledge of how the compiler would work or how they reorder memory layout, adjust address offsets of class members. I am not sure either ArrayObj::FromStd has a standard layout because it has std::vector inside.

Thank you so much!

tqchen · November 5, 2019, 2:56am

Thanks for the discussions, the ObjectPtr<StringObj> was a typo and i was intended mean ArrayObj. We only allow move constructor from std::vector.

Re: memory layout: ArrayObj::FromStd will have the same alignment and because it inherits ArrayObj. As long as we only look at the header part(in this case ArrayObj) it should be the same

haichen · November 7, 2019, 12:34am

I have one question about the tuple object as it’s the most common case. I assume the behavior of an ArrayObj is closer to vector as its size can change dynamically. So in the case of a tuple, should we use ADTObj with tag 0 or use ArrayObj?

tqchen · November 7, 2019, 1:21am

The current convention is to use ADT with tag 0

haichen · November 7, 2019, 1:29am

A follow up question. Do we want to support Array containers in graph runtime as well? Or just in VM?

tqchen · November 7, 2019, 3:17am

Depending on the possible usecases. We could allow array container in graph runtime, do you have something specific in mind?

FrozenGene · November 7, 2019, 5:19am

re string.
Seems that we want llvm::StringRef?

tqchen · November 7, 2019, 4:03pm

The String’s layout is same as llvm::StringRef(if we swap the size and data field). StringRef is a temporary implementation of string_view(which needs c++17)

The proposed String sub-classes the Object and is POD-C compatible self-managed container type, which means the same object can not only be used by c++. but can also be passed through other languages and construct from other languages.

tqchen · November 7, 2019, 10:27pm

Thanks for everyone’s thought. Seems we are converging. We start to implement each of them separately in runtime/container.h. Please see if you would like to take a stab at some of these.

@wweic given your previous experience in ADT, would you like to try to take a stab at this?