[RFC][Relay][Topi] Hashtable support

As in this PR, we propose adding HashTable support to TVM.

We observe a large number of TensorFlow NLP models in our production environment, and we would like to improve their serving speed using TVM. Many of these models incorporate HashTable-related operators to hash words/characters, convert DT_STRING data to integer types, and perform embedding/feature extraction in later steps.

In order to support HashTable-related operations in TVM, four basic TensorFlow operators need to be supported:

  • HashTableV2
  • InitializeTableFromTextFileV2
  • LookupTableImportV2
  • LookupTableFindV2

We consider four main issues in the proposed implementation:

  1. How to represent the HashTable object?

    We build a HashTable class and assign its pointer to the data field of a DLTensor, making a DLTensor that carries a HashTable element (see the first sketch after this list). By doing this, we can smoothly inherit the current TVM structure, where operators interact through DLTensors. In a TensorFlow graph, the HashTable appears as an input or output tensor of the HashTable-related operators, so this solution makes it natural to import from the TensorFlow graph. Moreover, by embedding the HashTable inside a DLTensor, existing DLTensor-related machinery can be reused, such as setting up tensors, passing tensors as arguments, and even the AllocTensor instruction in the VM. We build the HashTable class as an external library in src/runtime/contrib, like the sort library.

  2. How to handle the tensors with string type?

    For HashTable operations, the input and output tensors usually have string type. Currently, TVM doesn’t support string-typed data inside DLTensors. We assign a pointer to the C++ class std::string to the data field of a DLTensor to make a string-typed tensor (also covered in the first sketch after this list). Although TVM currently supports passing an individual string through the Object class, it doesn’t support an array of strings inside a DLTensor, and we believe it would not be a good idea to embed the Object class inside the DLTensor class.

  3. How to handle the table initializer operators?

    An important issue for HashTable-related operators is how to do table initialization. The TensorFlow operators LookupTableImportV2 and InitializeTableFromTextFileV2 are designed for this purpose. In TensorFlow, an init run with “tf.tables_initializer()” is required before executing the model. Similarly, we add an init run to the TVM graph runtime, and the initialization ops are executed only in this init run. The init run is triggered by “set_input(**params)” when setting up the parameters of the graph runtime; in the regular run, only the other ops are executed. To support the VM runtime, we think an “init” function consisting of the initialization ops can be invoked in the same way before invoking the “main” model function, though currently we only implement the graph runtime support.

    Alternatively, we could treat the HashTable object as a constant parameter. When parsing the model, we would build the HashTable object and complete the required initialization, so the HashTable becomes a constant. In this way, neither the graph runtime nor the VM runtime needs to be modified, and only the table lookup operators like LookupTableFindV2 remain in the model. However, this solution makes it hard to handle possible run-time modifications to the HashTable, and is thus less flexible. Also, the initialization ops would still need to be handled before model serving.

  4. How to transform between Numpy Array and NDArray with string elements?

    Inputs and params must be transformed from NumPy arrays to the DLTensor inside TVM’s NDArray. The “CopyFrom” and “AsNumpy” methods of NDArray handle this transformation, but for arrays of strings the original transformation does not work. We need to add new transformation methods for string arrays, which is why new “_ffi” functions beside “TVMArrayCopyFromBytes” and “TVMArrayCopyToBytes” are needed.
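
To make points 1 and 2 above concrete, below is a minimal sketch of the pointer-in-DLTensor idea. It is illustrative only, not the actual patch: the HashTable class, the use of a handle-style dtype, and the field names follow recent dlpack headers and may differ from the real implementation.

```cpp
#include <dlpack/dlpack.h>

#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical runtime hash table mapping string keys to int64 ids.
class HashTable {
 public:
  void Insert(const std::string& key, int64_t value) { table_[key] = value; }
  int64_t Find(const std::string& key, int64_t default_value) const {
    auto it = table_.find(key);
    return it == table_.end() ? default_value : it->second;
  }

 private:
  std::unordered_map<std::string, int64_t> table_;
};

// Wrap the table in a 0-d DLTensor so it can flow between operators like
// any other tensor (and through AllocTensor/InvokePacked in the VM).
DLTensor MakeTableTensor(HashTable* table, DLDevice dev) {
  DLTensor t{};
  t.data = table;                  // opaque pointer payload
  t.ndim = 0;
  t.dtype.code = kDLOpaqueHandle;  // handle-style dtype in recent dlpack
  t.dtype.bits = 64;
  t.dtype.lanes = 1;
  t.device = dev;
  return t;
}

// Similarly, a 1-d "string tensor": data points at an array of std::string.
DLTensor MakeStringTensor(std::string* strings, int64_t* shape, DLDevice dev) {
  DLTensor t{};
  t.data = strings;
  t.ndim = 1;
  t.shape = shape;
  t.dtype.code = kDLOpaqueHandle;
  t.dtype.bits = 64;
  t.dtype.lanes = 1;
  t.device = dev;
  return t;
}
```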

In addition, to add new data types for DLTensors, we rely on the recommended way of adding custom data types, as in this PR. Our proposed implementation uses the recommended way of adding external operations: the HashTable-related operations are first built as an external library, which is invoked by the TOPI layer to form operators, and the TensorFlow frontend parses HashTable-related operators from the TensorFlow graph into registered Relay operators. Besides the graph runtime, such an implementation also fits the VM runtime smoothly with the DLTensor-based HashTable and string types, since the ops can be handled by the InvokePacked instruction and the new data types are still NDArray objects holding DLTensors in the VM.
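
As a concrete illustration of this registration pattern, a lookup kernel could be exposed the way the sort library exposes argsort. Everything below is a sketch: the global name tvm.contrib.hashtable.lookup_table_find, the hashtable.h header, and the tensor layout are assumptions, not the actual patch.

```cpp
#include <dlpack/dlpack.h>
#include <tvm/runtime/registry.h>

#include <cstdint>
#include <string>

#include "hashtable.h"  // hypothetical header for the HashTable sketched above

namespace tvm {
namespace contrib {

using namespace runtime;

// Hypothetical external kernel for LookupTableFindV2: the table arrives as
// a 0-d handle tensor, the keys as a string tensor, and the results are
// written into an int64 output tensor.
TVM_REGISTER_GLOBAL("tvm.contrib.hashtable.lookup_table_find")
    .set_body([](TVMArgs args, TVMRetValue* ret) {
      DLTensor* table_tensor = args[0];   // carries the HashTable pointer
      DLTensor* keys = args[1];           // string-typed tensor of lookup keys
      DLTensor* default_value = args[2];  // scalar fallback for missing keys
      DLTensor* values = args[3];         // output tensor of looked-up ids
      auto* table = static_cast<HashTable*>(table_tensor->data);
      auto* key_data = static_cast<std::string*>(keys->data);
      auto* out = static_cast<int64_t*>(values->data);
      int64_t fallback = *static_cast<int64_t*>(default_value->data);
      for (int64_t i = 0; i < keys->shape[0]; ++i) {
        out[i] = table->Find(key_data[i], fallback);
      }
    });

}  // namespace contrib
}  // namespace tvm
```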

Any discussion is welcome! @tqchen @FrozenGene


Thanks for proposing this RFC. One quick question:

> We build the HashTable class as an external library in src/runtime/contrib, like the sort library.

If we do it this way, what about support for hardware other than CPU, for example GPU?

Re: how to support objects like HashTable and strings.

While DLTensor does not support strings and additional data types, we have started to introduce runtime::Object, which allows us to bring various objects into the runtime system. See the ongoing PR to support String: https://github.com/apache/incubator-tvm/pull/4628

So I would recommend rethinking the approach to make use of the Object system, instead of a custom data type, to support these additional data structures.


Since the hashtable operators are special in that they mainly perform lookup operations, they are normally executed on CPU. We think that, for now, we can only support hashtable-related execution on CPU.

Yeah, I agree that the better way may be to use runtime::Object to support String, rather than a custom data type, in a PR to be merged back; it is better to reuse the ongoing String support than to redo it. Maybe we can consider using the runtime::Object-based String and then embedding such String objects inside the data field of a DLTensor.

Also, what about the concern regarding the runtime modifications needed to handle the table initialization ops? Currently, we add an additional step to the graph runtime, “init_exec”, to run the table-initialization ops before model serving. When building the graph runtime, we separate the ops into two groups: one for model serving and one for initialization. During “init_exec” we run only the initialization ops, while during “run” we run only the model serving ops (see the sketch below). This solution requires modifying the graph runtime. What is your opinion on how to handle the table initialization ops, and what do you think of our current solution? Thank you so much for your valuable suggestions!
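
For clarity, the split can be pictured as follows; this is a rough sketch with hypothetical names, not the actual graph runtime code.

```cpp
#include <functional>
#include <vector>

// Rough illustration of the proposed split: the runtime keeps two op lists
// and executes the initialization list only once, before serving.
class GraphRuntimeWithInit {
 public:
  // "init_exec": run the table-initialization ops once, e.g. triggered
  // together with set_input(**params) while setting up the runtime.
  void InitExec() {
    for (auto& op : init_ops_) op();
  }
  // "run": the regular inference path executes only the serving ops.
  void Run() {
    for (auto& op : serving_ops_) op();
  }

 private:
  std::vector<std::function<void()>> init_ops_;     // e.g. LookupTableImportV2
  std::vector<std::function<void()>> serving_ops_;  // e.g. LookupTableFindV2
};
```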

We don’t need to put Object into the data field of DLTensor. Instead, we will also introduce an Array container object that can hold a list of objects, which resolves the List[HashTable] issue.
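
For illustration, a list of strings under the Object-based design could look like the sketch below, assuming the String support from PR 4628 and the Array container; header paths vary across TVM versions.

```cpp
// Sketch only: Object-based containers instead of raw pointers in DLTensor.
// The header path varies by TVM version (tvm/runtime/container.h in older
// releases; tvm/runtime/container/{array,string}.h in newer ones).
#include <tvm/runtime/container.h>

using tvm::runtime::Array;
using tvm::runtime::String;

void Example() {
  // A list of strings as a first-class runtime object.
  Array<String> keys = {"cat", "dog", "fish"};
  String first = keys[0];
  // An Array of a future HashTable object would cover List[HashTable]
  // in the same way.
  (void)first;
}
```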

Thank you for the information. Is the Array container object already supported in the current release?

@lfengad Could we follow up on this? We have supported String and Array now.

I am working on a POD-C compliant hash map [link], which might be helpful somehow. However, representing this in the Relay IR is still challenging.

I encountered a similar issue with HashTable for TensorFlow:

NotImplementedError: The following operators are not implemented: {'DivNoNan', 'LookupTableImportV2', 'HashTableV2', 'AsString', 'StringToHashBucketFast', 'LookupTableFindV2', 'LookupTableSizeV2'}

and then noticed this RFC discussion thread. @lfengad @tqchen I am wondering if we have any good solution for LookupTable? I checked several of my models, and these operators are very common across all of them, mostly introduced by embedding/TensorFlow-Transform.