How do I properly work with booleans in TIR?

bowenliu · September 29, 2022, 5:18am

Hello, I’m working with TIR and Relay. I want to deploy a 1-bit network through the TIR lowering flow, but when I use Bool as the DataType I run into a problem: the StorageFlatten pass converts Bool to Int(8), so I lose the 1-bit representation.

  // TODO(Lunderberg): Move the handling of boolean into a
  // dedicated pass.

  // Boolean tensors are backed by a Int8 array.
  if (e.flattened_buffer->dtype == DataType::Bool()) {
    auto writer = e.flattened_buffer.CopyOnWrite();
    writer->dtype = DataType::Int(8);
  }

···

I found that using the int1 type retains the 1-bit information in TIR. However, in NDArray, “bits=1” is automatically converted to the bool type, so I have to insert a pass in the Relay phase to convert bool to int1. This way I get the desired TIR form and can continue with the intrinsic pipeline and codegen. The TIR looks like this:

primfn(...)
buffers = {placeholder: Buffer(placeholder_20: Pointer(int1), int1, [43008], [], elem_offset=10752),
                Conv2dOutput: Buffer(Conv2dOutput_2: Pointer(int1), int1, [86016], [])}
{
  allocate(placeholder.global_0: Pointer(global int1), int1, [9216, 2240]), storage_scope = global;
  allocate(Conv2dOutput.global_0: Pointer(global int1), int1, [20480, 3392]), storage_scope = global {
  }
}

So is it reasonable for me to do this? And what will happen to TIR 1-bit processing later?

yzh119 · September 29, 2022, 6:26pm

I don’t think using boolean is the correct way to deal with binary networks. C++’s bool type is typically stored in 8 bits, and PyTorch’s BoolTensor also uses 8 bits per element. Currently, TVM still uses the ordinary uint32 type to store binary data: https://github.com/apache/tvm/blob/2379917985919ed3918dc12cad47f469f245be7a/python/tvm/topi/nn/bnn.py
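To make the uint32 packing idea concrete, here is a minimal pure-Python sketch. It is not TVM code: `pack_bits_u32` is a hypothetical helper, and the least-significant-bit-first order is an arbitrary choice for illustration that may not match the layout used in bnn.py.

```python
# Hypothetical helper, not part of TVM: pack a flat list of 0/1 values
# into 32-bit words. Bit order (LSB first) is an assumption.
WORD_BITS = 32


def pack_bits_u32(bits):
    """Return a list of 32-bit integers holding `bits`, 32 values per word."""
    if len(bits) % WORD_BITS != 0:
        raise ValueError("length must be a multiple of 32")
    words = []
    for start in range(0, len(bits), WORD_BITS):
        word = 0
        for offset, bit in enumerate(bits[start:start + WORD_BITS]):
            word |= (bit & 1) << offset
        words.append(word)
    return words


bits = [1, 0, 1, 1] + [0] * 28 + [1] * 32
print([hex(w) for w in pack_bits_u32(bits)])  # prints ['0xd', '0xffffffff']
```

Sixty-four 1-bit values fit in two words here, whereas a bool-per-byte layout would need 64 bytes.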

junrushao · September 30, 2022, 2:52pm

Yep. Generally speaking, subword data types should be padded to 8 bits (unless TVM has intentionally compacted the storage) so that efficient random access is possible.

bowenliu · October 8, 2022, 7:17am

Thanks! LGTM. Since TVM uses the ordinary uint32 type to store binary data, the data size must be divisible by 32, so I intend to use uint8 to represent binary data instead. The example in the link is for the TE layer, but I want to start this work at the Relay layer, so I need to add a Relay pass that packs 1-bit data into uint8. Do you have any better suggestions?
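A rough sketch of the uint8 variant, again in plain Python rather than as a Relay pass. The helper names `pack_bits_u8` and `unpack_bits_u8` are hypothetical, and the sketch assumes zero-padding of the last byte and LSB-first bit order. It shows that a length only needs rounding up to a multiple of 8, provided the original length is carried alongside the packed data.

```python
# Hypothetical helpers, not TVM APIs. Assumes LSB-first bit order and
# zero-padding of the final byte.
def pack_bits_u8(bits):
    """Pack 0/1 values into bytes, 8 values per byte."""
    padded_len = (len(bits) + 7) // 8 * 8
    padded = list(bits) + [0] * (padded_len - len(bits))
    packed = bytearray()
    for start in range(0, padded_len, 8):
        byte = 0
        for offset in range(8):
            byte |= (padded[start + offset] & 1) << offset
        packed.append(byte)
    return bytes(packed)


def unpack_bits_u8(packed, length):
    """Recover the first `length` bits from a packed byte string."""
    return [(packed[i // 8] >> (i % 8)) & 1 for i in range(length)]


bits = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1]  # 11 bits -> 2 bytes
packed = pack_bits_u8(bits)
print(packed.hex())  # prints 8305
assert unpack_bits_u8(packed, len(bits)) == bits
```

The padding bits are indistinguishable from real zeros once packed, which is why the unpack step takes the logical length as a separate argument.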

yzh119 · October 9, 2022, 9:38pm

Packing 1-bit data into uint8/uint32 is an option.

Currently there is no 1-bit solution in TVM like bitset in C++, and you are welcome to create an RFC to discuss how to support it (you would need to customize codegen for memory load/store, handle storage alignment issues, etc.).

To be honest, I still think packing several (8/16/32) 1-bit values together is the best solution, for both hardware efficiency and engineering effort.
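To illustrate why a bitset-style type would need custom load/store handling, here is a small Python sketch of single-bit access on a byte-packed buffer. The names `load_bit` and `store_bit` are hypothetical and the LSB-first layout is an assumption. This is not what any TVM codegen emits; it only shows the shift-and-mask work that each element access implies.

```python
# Hypothetical helpers, not TVM APIs. Assumes 8 bits per byte, LSB first.
def load_bit(buf, index):
    """Read one bit: fetch the containing byte, then shift and mask."""
    return (buf[index >> 3] >> (index & 7)) & 1


def store_bit(buf, index, value):
    """Write one bit: a read-modify-write on the containing byte."""
    mask = 1 << (index & 7)
    if value:
        buf[index >> 3] |= mask
    else:
        buf[index >> 3] &= ~mask & 0xFF


buf = bytearray(2)        # 16 bits, all zero
store_bit(buf, 10, 1)     # bit 10 lives in byte 1, position 2
print(buf.hex())          # prints 0004
print(load_bit(buf, 10))  # prints 1
```

Because a store touches the whole byte, two writers updating different bits of the same byte would conflict, which is one reason packing whole groups of bits at once is simpler than per-bit addressing.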