Shape inference logic of dense op

I was wondering about the shape inference logic of the dense op. Usually a flatten op precedes the dense op in a network, so the input to the dense op is 2D. If there is no explicit flatten op before the dense op and the input data X is 4D, I would expect X to be flattened implicitly before the multiplication XW^T is applied. But in the 4D case, that expectation does not match what is documented and implemented:

- **data**: `(x1, x2, ..., xn, input_dim)`
- **weight**: `(units, input_dim)`
- **bias**: `(units,)`
- **out**: `(x1, x2, ..., xn, units)`

Per the documentation and implementation, input data of shape (32, 3, 224, 224) with units=10 would produce an output of shape (32, 3, 224, 10), which does not seem correct: flattening first would give (32, 10).
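To make the mismatch concrete, here is a minimal pure-Python sketch of the two shape rules. The helper names `documented_dense_shape` and `flattened_dense_shape` are made up for illustration and are not part of nnvm or topi; the second one assumes "flatten" means collapsing every axis after the batch axis.

```python
from math import prod


def documented_dense_shape(data_shape, units):
    """Documented rule: keep all leading axes, replace the last axis with `units`."""
    return tuple(data_shape[:-1]) + (units,)


def flattened_dense_shape(data_shape, units):
    """Assumed rule: flatten to (batch, input_dim) first, then apply dense.

    Returns the output shape and the input_dim the weight would need.
    """
    batch = data_shape[0]
    input_dim = prod(data_shape[1:])  # product of all non-batch axes
    return (batch, units), input_dim


data_shape = (32, 3, 224, 224)
print(documented_dense_shape(data_shape, 10))  # (32, 3, 224, 10)
print(flattened_dense_shape(data_shape, 10))   # ((32, 10), 150528)
```

Under the documented rule the weight would have shape (10, 224), while the flattened interpretation needs a weight of shape (10, 150528), so the two readings are not interchangeable.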

Am I misinterpreting something here?

The input is supposed to be flattened before the dense op. In the topi implementation, the input to the dense op is 2D. I think the doc of the corresponding nnvm op is wrong.

Thank you, @masahi, for the reply. In this case, I will change the doc and add some checks in the current shape inference logic to provide clearer error messages. I will submit a PR fixing this along with a project I’m working on. Thanks.
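For illustration only, the kind of check I have in mind could look roughly like the Python sketch below. This is not the nnvm code; `infer_dense_shape` is a hypothetical helper, and it assumes the op should accept only 2D data with a weight of shape `(units, input_dim)`.

```python
def infer_dense_shape(data_shape, weight_shape):
    """Hypothetical shape check for a dense op that only accepts 2D data."""
    if len(data_shape) != 2:
        raise ValueError(
            "dense expects 2D data (batch, input_dim), got shape "
            f"{tuple(data_shape)}; insert a flatten op before dense"
        )
    if len(weight_shape) != 2:
        raise ValueError(
            f"dense expects 2D weight (units, input_dim), got shape {tuple(weight_shape)}"
        )
    batch, input_dim = data_shape
    units, weight_input_dim = weight_shape
    if input_dim != weight_input_dim:
        raise ValueError(
            f"input_dim mismatch: data has {input_dim}, weight has {weight_input_dim}"
        )
    return (batch, units)


print(infer_dense_shape((32, 150528), (10, 150528)))  # (32, 10)
```

With this check, passing the 4D shape (32, 3, 224, 224) fails immediately with a message that points at the missing flatten op.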