Is a DLTensor with increasing strides stored in column-major order?

I’m just beginning to learn TVM, so please correct me if anything is wrong.

The equivalent question is: is a DLTensor with decreasing strides stored in row-major order?

For example, take a compact 3-D tensor of shape (4, 3, 2). If it is stored in row-major order, its strides will be (6, 2, 1).
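As a sanity check, here is a small standalone snippet (my own illustration, not TVM code) that computes compact row-major strides as the running product of the trailing dimensions:

#include <cstdio>
#include <vector>

// Row-major strides of a compact tensor: each stride is the product
// of all dimensions that come after it.
std::vector<long> RowMajorStrides(const std::vector<long>& shape) {
  std::vector<long> strides(shape.size(), 1);
  for (int i = static_cast<int>(shape.size()) - 2; i >= 0; --i)
    strides[i] = strides[i + 1] * shape[i + 1];
  return strides;
}

int main() {
  auto s = RowMajorStrides({4, 3, 2});
  printf("%ld %ld %ld\n", s[0], s[1], s[2]);  // prints: 6 2 1
  return 0;
}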

I’m really confused about the code here: ./src/runtime/contrib/cblas/gemm_utils.h

which encapsulates C = matmul(A, B).

In function CallGemm:

// Reversed strides indicates an in-place transpose operation.
inline bool IsInPlaceTransposed(DLTensor* tensor) {
  return tensor->strides && (tensor->strides[1] > tensor->strides[0]);
}
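Before the full function, a tiny demo of my own (not TVM code) of why reversed strides amount to an in-place transpose: reading the same buffer through swapped strides yields the transposed matrix without moving a byte.

#include <cstdint>
#include <cstdio>

// Element (i, j) of a strided 2-D view over a flat buffer.
float At(const float* data, const int64_t* strides, int64_t i, int64_t j) {
  return data[i * strides[0] + j * strides[1]];
}

int main() {
  float data[6] = {0, 1, 2, 3, 4, 5};  // 2x3 matrix A, row-major
  int64_t strides[2] = {3, 1};         // decreasing strides: A itself
  int64_t strides_t[2] = {1, 3};       // reversed strides: A transposed
  printf("%g\n", At(data, strides, 0, 1));    // A[0][1]   == 1
  printf("%g\n", At(data, strides_t, 1, 0));  // A^T[1][0] == 1, same byte
  return 0;
}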

template <typename TGemmOp>
inline void CallGemm(TVMArgs args, TVMRetValue* ret, TGemmOp op) {
  DLTensor* A = args[0];
  DLTensor* B = args[1];
  DLTensor* C = args[2];
  bool transa = args[3];
  bool transb = args[4];
  int bit_depth = sizeof(typename TGemmOp::TDatatype) * 8;
  CHECK_EQ(A->ndim, 2);
  CHECK_EQ(B->ndim, 2);
  CHECK_EQ(C->ndim, 2);

  CHECK_EQ(ElementStride(A), 1);
  CHECK_EQ(ElementStride(B), 1);
  CHECK_EQ(ElementStride(C), 1);

  // C can never be transposed.
  CHECK(!IsInPlaceTransposed(C));

  // Reversed strides indicates an in-place transpose operation.
  transa = IsInPlaceTransposed(A) ? !transa : transa;
  transb = IsInPlaceTransposed(B) ? !transb : transb;

  CHECK(TypeMatch(B->dtype, kDLFloat, bit_depth));
  CHECK(TypeMatch(C->dtype, kDLFloat, bit_depth));
  double alpha = args.size() > 5 ? args[5] : 1.0;
  double beta = args.size() > 6 ? args[6] : 0.0;
  op(transb, transa, ColumnCount(B, transb), RowCount(A, transa), ColumnCount(A, transa),
     static_cast<typename TGemmOp::TDatatype>(alpha),
     reinterpret_cast<typename TGemmOp::TDatatype*>(static_cast<char*>(B->data) + B->byte_offset),
     ColumnStride(B),
     reinterpret_cast<typename TGemmOp::TDatatype*>(static_cast<char*>(A->data) + A->byte_offset),
     ColumnStride(A), static_cast<typename TGemmOp::TDatatype>(beta),
     reinterpret_cast<typename TGemmOp::TDatatype*>(static_cast<char*>(C->data) + C->byte_offset),
     ColumnStride(C));
}
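One thing to notice in the op(...) call: B is passed before A. A column-major GEMM reads the two row-major buffers as B^T and A^T, so it computes

  B^T * A^T = (A * B)^T = C^T

and that column-major C^T occupies exactly the same bytes as a row-major C = A * B.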

I wonder whether these two lines are redundant:

  // Reversed strides indicates an in-place transpose operation.
  transa = IsInPlaceTransposed(A) ? !transa : transa;
  transb = IsInPlaceTransposed(B) ? !transb : transb;

Since A and B here must be row-major (in the op call B is passed ahead of A, the trick described above for using a column-major matmul on row-major A and B), A and B must have decreasing strides.

That is to say, A->strides[0] must be greater than A->strides[1], so IsInPlaceTransposed(A) must always be false and the two assignments would never flip the flags.

I found in the wiki link that, by definition, a row-major tensor has decreasing strides.
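Concretely, for a compact row-major tensor of shape (d_0, ..., d_{n-1}), stride k is the product of the trailing dimensions, stride_k = d_{k+1} * d_{k+2} * ... * d_{n-1}, which is strictly decreasing whenever every d_i > 1; the (4, 3, 2) example gives (3*2, 2, 1) = (6, 2, 1).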

So A and B here can each be either row-major or column-major; that’s why we need IsInPlaceTransposed. (Although in TVM, tensors are most commonly stored in row-major order.)

Finally, I understand that my confusion actually comes from the necessity of the transa and transb arguments in ./python/tvm/contrib/cblas.py.

Since A is a DLTensor*, I think transa could be determined from A->strides:

  1. If A->strides[0] > A->strides[1], A is inferred to be row-major, so transa = False.

  2. Otherwise, A is inferred to be column-major, so transa = True.

The same goes for transb; a sketch of this inference is below.
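A minimal sketch (the helper name InferTrans is mine, not TVM’s), assuming a compact 2-D tensor:

#include <dlpack/dlpack.h>

// Hypothetical helper: infer the transpose flag of a 2-D DLTensor
// purely from its strides.
inline bool InferTrans(const DLTensor* tensor) {
  // DLPack allows strides == nullptr for a compact row-major tensor;
  // no transpose in that case.
  if (tensor->strides == nullptr) return false;
  // Decreasing strides: row-major, trans = False.
  // Increasing strides: column-major, trans = True.
  return tensor->strides[1] > tensor->strides[0];
}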

PS.

In the C interface of CBLAS, the matmul routine is declared as follows:

void cblas_sgemm(const CBLAS_LAYOUT Layout, const CBLAS_TRANSPOSE TransA,
                 const CBLAS_TRANSPOSE TransB, const MKL_INT M, const MKL_INT N,
                 const MKL_INT K, const float alpha, const float *A,
                 const MKL_INT lda, const float *B, const MKL_INT ldb,
                 const float beta, float *C, const MKL_INT ldc) NOTHROW;

We need TransA and TransB here because A and B are just raw pointers instead of DLTensor*, so there is no clue whether A is column-major or row-major.
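For instance, here is a minimal sketch (assuming a CBLAS implementation such as OpenBLAS or MKL is linked) of a plain row-major call, where the Layout and Trans flags are the caller’s only way to declare how the buffers should be read:

#include <cblas.h>
#include <cstdio>

int main() {
  // Row-major 2x3 A times 3x2 B gives 2x2 C.
  float A[6] = {1, 2, 3, 4, 5, 6};
  float B[6] = {1, 0, 0, 1, 1, 1};
  float C[4] = {0};
  cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
              /*M=*/2, /*N=*/2, /*K=*/3,
              1.0f, A, /*lda=*/3, B, /*ldb=*/2,
              0.0f, C, /*ldc=*/2);
  printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);  // 4 5 / 10 11
  return 0;
}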

DLTensor requires that the data be stored in row-major order, and all the logic is based on that row-major assumption.

Most CBLAS interfaces assume the input is column-major, if I recall correctly.

Thanks for the clarification @tqchen, the code is much more readable to me now.