[Relay] [NN] Does relay.nn.dense support multi-dimensional input?

The doc says relay.nn.dense can take input tensors with shape (d_1, d_2, …, d_n, units_in). However, when I use a 3-D tensor as the input to dense, I get this error:

  File "/root/anaconda3/envs/tvm0.8-dev/lib/python3.7/site-packages/tvm-0.8.dev1067+g2cde3dc0e-py3.7-linux-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun
    rv = local_pyfunc(*pyargs)
  File "/root/anaconda3/envs/tvm0.8-dev/lib/python3.7/site-packages/tvm-0.8.dev1067+g2cde3dc0e-py3.7-linux-x86_64.egg/tvm/relay/op/strategy/generic.py", line 726, in _compute_dense
    return [topi_compute(*args)]
  File "/root/anaconda3/envs/tvm0.8-dev/lib/python3.7/site-packages/tvm-0.8.dev1067+g2cde3dc0e-py3.7-linux-x86_64.egg/tvm/autotvm/task/topi_integration.py", line 162, in wrapper
    node = topi_compute(cfg, *args)
  File "/root/anaconda3/envs/tvm0.8-dev/lib/python3.7/site-packages/tvm-0.8.dev1067+g2cde3dc0e-py3.7-linux-x86_64.egg/tvm/topi/x86/dense.py", line 215, in dense_pack
    M, K = get_const_tuple(data.shape)  # batch, in_dim
ValueError: too many values to unpack (expected 2)

I’m confused. Does relay.nn.dense currently only support 2-D tensors as input?
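For reference, the traceback boils down to a two-value unpack being applied to a shape tuple with more than two elements. A minimal pure-Python sketch of what happens inside `dense_pack` (the shape values here are hypothetical, just for illustration):

```python
# dense_pack assumes a 2-D data tensor and unpacks its shape as (M, K).
shape_2d = (1024, 1024)
M, K = shape_2d  # fine: M = batch, K = in_dim

# With a 3-D input shape, the same unpack fails:
shape_3d = (8, 128, 1024)  # hypothetical 3-D input shape
try:
    M, K = shape_3d
except ValueError as e:
    print(e)  # too many values to unpack (expected 2)
```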


Hmm … this looks like an inconsistency in TVM rather than a problem in your code.

The Relay op definition of dense supports multi-dimensional input (e.g. the doc string and the shape functions), while the current computation (e.g. topi.nn.dense) does not.

My guess is that the dense op was designed to support multi-dimensional input, but only simpler compute and schedule implementations, which handle the 2-D case, were ever added to TOPI.

Thanks for the reply. I also found that relay.nn.batch_matmul has a similar problem (no transpose for y when the target is llvm, but with transpose when the target is cuda and cuBLAS is used).
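To make the discrepancy concrete, the two batch_matmul conventions differ in how the second operand is laid out. A numpy sketch of both (the shapes are illustrative, not taken from any TVM kernel):

```python
import numpy as np

x = np.random.randn(2, 3, 4)    # (batch, M, K)
y_t = np.random.randn(2, 5, 4)  # (batch, N, K): "y transposed" layout
y = np.random.randn(2, 4, 5)    # (batch, K, N): plain layout

# Convention 1: second operand stored transposed, out[b] = x[b] @ y_t[b].T
out_transposed = x @ y_t.transpose(0, 2, 1)  # (2, 3, 5)

# Convention 2: second operand stored plainly, out[b] = x[b] @ y[b]
out_plain = x @ y                            # (2, 3, 5)
```

Both produce the same output shape, which is why mixing the conventions between backends can silently give wrong results rather than a shape error.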

Ah yes, I’ve noticed in the past that some schedules don’t really follow the specification of the operation. I would personally open an issue.


That would be really great!

I would like to help fix these when I have time. I’ve just added an nn.matmul op that supports input tensors in either transposed or non-transposed format.
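The semantics of such transpose flags can be pictured with a small numpy helper (my own reference sketch, not the TVM implementation; the function name and signature are hypothetical):

```python
import numpy as np

def matmul_ref(a, b, transpose_a=False, transpose_b=False):
    """Reference semantics for a 2-D matmul with transpose flags."""
    if transpose_a:
        a = a.T
    if transpose_b:
        b = b.T
    return a @ b

a = np.random.randn(3, 4)
b = np.random.randn(5, 4)           # dense-style weight layout: (units_out, units_in)
out = matmul_ref(a, b, transpose_b=True)  # shape (3, 5), same as a @ b.T
```

With transpose_b=True this matches the weight layout that nn.dense expects, while the default matches a plain matrix product.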

I opened an issue here: [Topi] Allow relay.nn.dense support arbitrary number dimensions · Issue #8412 · apache/tvm · GitHub

I’ll try to fix it if I have time this week.


Hi folks, an investigation of a different issue led me to this post. Since we cannot run nn.dense on inputs with more than two dimensions, our frontends add a reshape before and after nn.dense. But our op fusion pass doesn’t fuse reshape with dense, so we can no longer fuse dense with the activation ops that follow it, because of the reshape stuck in the middle.

In particular, when we import Hugging Face transformer models, most of the dense ops are not fused with elementwise ops at all, so we end up with something like

  ...
  %987 = fn (%p0420: Tensor[(1024, 1024), float16], %p1305: Tensor[(4096, 1024), float16], Primitive=1, hash="c13735290dc46bbc") -> Tensor[(1024, 4096), float16] {
    nn.dense(%p0420, %p1305, units=None, out_dtype="float16") /* ty=Tensor[(1024, 4096), float16] */
  };
  %988 = %987(%986, meta[relay.Constant][16] /* ty=Tensor[(4096, 1024), float16] */) /* ty=Tensor[(1024, 4096), float16] */;
  %989 = fn (%p0419: Tensor[(1024, 4096), float16], %p1304: Tensor[(4096), float16], %p2142: float16, Primitive=1, hash="ab37ab7bd1a05f99") -> Tensor[(1024, 4096), float16] {
    %887 = reshape(%p0419, newshape=[8, 128, 4096]) /* ty=Tensor[(8, 128, 4096), float16] */;
    %888 = add(%887, %p1304) /* ty=Tensor[(8, 128, 4096), float16] */;
    %889 = multiply(%888, %p2142) /* ty=Tensor[(8, 128, 4096), float16] */;
    %890 = cast(%889, dtype="float32") /* ty=Tensor[(8, 128, 4096), float32] */;
    %891 = erf(%890) /* ty=Tensor[(8, 128, 4096), float32] */;
    %892 = multiply(%891, 0.5f /* ty=float32 */) /* ty=Tensor[(8, 128, 4096), float32] */;
    %893 = cast(%888, dtype="float32") /* ty=Tensor[(8, 128, 4096), float32] */;
    %894 = add(0.5f /* ty=float32 */, %892) /* ty=Tensor[(8, 128, 4096), float32] */;
    %895 = multiply(%893, %894) /* ty=Tensor[(8, 128, 4096), float32] */;
    %896 = reshape(%895, newshape=[-1, 4096]) /* ty=Tensor[(1024, 4096), float32] */;
    cast(%896, dtype="float16") /* ty=Tensor[(1024, 4096), float16] */
  };
 ...
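The frontend workaround visible in the printout amounts to the following numpy sketch (shapes are taken from the IR above; I use float32 here instead of the float16 in the printout just so plain numpy stays fast):

```python
import numpy as np

# Frontends flatten the leading dims so nn.dense sees a 2-D input,
# then restore them afterwards.
data = np.random.randn(8, 128, 1024).astype("float32")  # 3-D activation
weight = np.random.randn(4096, 1024).astype("float32")  # (units_out, units_in)

flat = data.reshape(-1, 1024)      # (1024, 1024): reshape inserted before dense
out2d = flat @ weight.T            # (1024, 4096): what nn.dense computes
out = out2d.reshape(8, 128, 4096)  # reshape inserted after dense
```

It is exactly these two reshapes that block the fusion of dense with the elementwise ops that follow it.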

So it is very important that we fix this issue.
