
Why Numpy Has Dimension (n,) Instead Of (n,1) Only

I have been curious about this for some time. I can live with it, but it always bites me when I am not careful enough, so I decided to post it here. Suppose the following example

Solution 1:

numpy's philosophy is not that a[:, 0] is a "column vector" and a[0, :] a "row vector" in the general case. Rather, they are both, quite simply, vectors, i.e. arrays with one and only one dimension. This is actually highly logical and consistent (but yes, it can get annoying for those of us accustomed to Matlab).
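To make this concrete, here is a minimal sketch. The array a and its 3-by-4 shape are made up for illustration; they are not taken from the question.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # hypothetical 2-D array: 3 rows, 4 columns

col = a[:, 0]   # integer index on axis 1 drops that axis
row = a[0, :]   # integer index on axis 0 drops that axis

print(col.shape)            # (3,)
print(row.shape)            # (4,)
print(col.ndim, row.ndim)   # 1 1

# Indexing with a length-1 slice instead of an integer keeps the axis.
col_2d = a[:, 0:1]
print(col_2d.shape)         # (3, 1)
```

Both slices come back as plain one-dimensional arrays; only the length-1 slice preserves the "column" shape.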

I say "in the general case" because that is true for numpy's most general data structure, the array, which is intended for all kinds of multi-dimensional dense data storage and manipulation applications, not just matrix math. Having "rows" and "columns" is a highly specialized context for array operations, but yes, a very common one: that's why numpy also supplies the matrix class. Convert your array to a numpy.matrix (or use the matrix constructor instead of array to begin with) and you will see behaviour closer to what you expect. Be aware, though, that matrix objects are always exactly two-dimensional, and that the NumPy documentation no longer recommends the matrix class for new code. For more information, see What are the differences between numpy arrays and matrices? Which one should I use?
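As a rough illustration of the difference, the sketch below wraps a made-up 3-by-4 array in numpy.matrix. It assumes the installed NumPy still ships the matrix class, and it silences any deprecation-style warning the constructor may emit.

```python
import warnings
import numpy as np

a = np.arange(12).reshape(3, 4)   # hypothetical example array

with warnings.catch_warnings():
    # np.matrix may emit a PendingDeprecationWarning; ignore it for this demo.
    warnings.simplefilter("ignore")
    m = np.matrix(a)
    col = m[:, 0]   # matrix indexing always returns a 2-D result
    row = m[0, :]

print(col.shape)   # (3, 1)
print(row.shape)   # (1, 4)
```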

For cases where you're dealing with more than 2 dimensions, take a look at the numpy.expand_dims function. Though the syntax is annoyingly redundant and unpythonically verbose, when I'm working on arrays with more than 2 dimensions (so cannot use matrix), I'm forever having to use expand_dims to do this kind of thing:

A -= numpy.expand_dims( A.mean( axis=2 ), 2 )   # subtract mean-across-layers from A

instead of

A -= A.mean( axis=2 )   # naive attempt: raises a ValueError unless the shapes happen to be broadcastable
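Here is a small runnable sketch of that comparison. The array A and its (2, 3, 4) shape are invented for the example. It also shows two alternatives that are not in the original answer: the keepdims=True argument of mean and indexing with np.newaxis, both of which produce the same (2, 3, 1) result as expand_dims.

```python
import numpy as np

A = np.arange(24, dtype=float).reshape(2, 3, 4)   # hypothetical 3-D array

m = A.mean(axis=2)                        # shape (2, 3): the averaged axis is dropped
m_expanded = np.expand_dims(m, 2)         # shape (2, 3, 1)
m_keep = A.mean(axis=2, keepdims=True)    # shape (2, 3, 1), no second call needed
m_newaxis = m[:, :, np.newaxis]           # shape (2, 3, 1), via indexing

centered = A - m_keep                     # broadcasts (2, 3, 4) against (2, 3, 1)

try:
    A - m                                 # (2, 3, 4) vs (2, 3): trailing axes 4 and 3 clash
    naive_failed = False
except ValueError:
    naive_failed = True

print(m.shape, m_expanded.shape)   # (2, 3) (2, 3, 1)
print(naive_failed)                # True
```

Note that the naive version only fails when the shapes are incompatible. If A happened to be a cube, say (4, 4, 4), the subtraction would broadcast without error but subtract the means along the wrong axis, which is arguably worse than an exception.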

But consider Matlab, by contrast. Matlab implicitly asserts that there is no such thing as a one-dimensional object and that the minimum number of dimensions a thing can ever have is 2. Sure, you and I are both highly accustomed to this, but take a moment to realize how arbitrary it is. There is clearly a conceptual difference between a fundamentally one-dimensional object, and a two-dimensional object that just happens to have extent 1 in one of its dimensions: the latter is allowed to grow in its second dimension, whereas the former doesn't even know what the second dimension means—and why should it? Hence a.shape==(N,) and a.shape==(N,1) make perfect sense as separate cases. You might as well ask "why is it not (N, 1, 1)?" or "why is it not (N, 1, 1, 1, 1, 1, 1)?"
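The practical consequences of that distinction can be sketched as follows. The names v and c are made up for the example; reshape and ravel are used here simply as one common way to move between the two shapes.

```python
import numpy as np

v = np.arange(3)        # shape (3,): a genuinely one-dimensional array
c = v.reshape(-1, 1)    # shape (3, 1): two-dimensional, extent 1 along axis 1

print(v.T.shape)        # (3,)   transposing a 1-D array changes nothing
print(c.T.shape)        # (1, 3) transposing a 2-D array swaps its axes

# Broadcasting treats the two shapes differently:
print((v + v).shape)    # (3,)   elementwise sum
print((v + c).shape)    # (3, 3) every element of c combined with every element of v

print(c.ravel().shape)  # (3,)   back to one dimension
```

Mixing the two shapes by accident is a common source of surprising (3, 3) results, so it helps to check .shape whenever a slice or a reduction is fed into arithmetic.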
