Discussion:
[Numpy-discussion] matmul needs some clarification.
Charles R Harris
2015-05-30 22:23:47 UTC
Permalink
Hi All,

The problem arises when multiplying a stack of matrices times a vector.
PEP465 defines this as appending a '1' to the dimensions of the vector and
doing the defined stacked matrix multiply, then removing the last dimension
from the result. Note that in the middle step we have a stack of matrices
and after removing the last dimension we will still have a stack of
matrices. What we want is a stack of vectors, but we can't have those with
our conventions. This makes the result somewhat unexpected. How should we
resolve this?

Chuck
Alexander Belopolsky
2015-06-03 19:38:19 UTC
Permalink
Post by Charles R Harris
The problem arises when multiplying a stack of matrices times a vector.
PEP465 defines this as appending a '1' to the dimensions of the vector and
doing the defined stacked matrix multiply, then removing the last dimension
from the result. Note that in the middle step we have a stack of matrices
and after removing the last dimension we will still have a stack of
matrices. What we want is a stack of vectors, but we can't have those with
our conventions. This makes the result somewhat unexpected. How should we
resolve this?
I think that before tackling the @ operator, we should implement the pure
dot of stacks of matrices and dot of stacks of vectors generalized ufuncs.
The first will have a 2d "core" and the second - 1d. Let's tentatively
call them matmul and vecmul. Hopefully matrix vector product can be
reduced to the vecmul,
but I have not fully figured this out. If not - we may need the third
ufunc.

Once we have these ufuncs, we can decide what @ operator should do in terms
of them and possibly some axes manipulation.
Charles R Harris
2015-06-03 21:12:48 UTC
Permalink
On Sat, May 30, 2015 at 6:23 PM, Charles R Harris <
Post by Charles R Harris
The problem arises when multiplying a stack of matrices times a vector.
PEP465 defines this as appending a '1' to the dimensions of the vector and
doing the defined stacked matrix multiply, then removing the last dimension
from the result. Note that in the middle step we have a stack of matrices
and after removing the last dimension we will still have a stack of
matrices. What we want is a stack of vectors, but we can't have those with
our conventions. This makes the result somewhat unexpected. How should we
resolve this?
dot of stacks of matrices and dot of stacks of vectors generalized ufuncs.
The first will have a 2d "core" and the second - 1d. Let's tentatively
call them matmul and vecmul. Hopefully matrix vector product can be
reduced to the vecmul,
but I have not fully figured this out. If not - we may need the third
ufunc.
The `@` operator is done. I originally started with four ufuncs, mulvecvec,
mulmatvec, etc, but decided to wait on that until we merged the ufunc and
multiarray packages and did some other ufunc work. The matmul function can
certainly be upgraded in the future, but is as good as dot right now except
it doesn't handle object arrays.
terms of them and possibly some axes manipulation.
Chuck
Alexander Belopolsky
2015-06-05 00:50:56 UTC
Permalink
but is as good as dot right now except it doesn't handle object arrays.
This is a fairly low standard. :-(
Charles R Harris
2015-06-05 01:33:14 UTC
Permalink
On Wed, Jun 3, 2015 at 5:12 PM, Charles R Harris <
but is as good as dot right now except it doesn't handle object arrays.
This is a fairly low standard. :-(
Meaning as fast. I expect ufuncs to have more call overhead and they need
to use blas to be competitive for float et al.

Chuck

Stephan Hoyer
2015-06-03 20:25:57 UTC
Permalink
Post by Charles R Harris
The problem arises when multiplying a stack of matrices times a vector.
PEP465 defines this as appending a '1' to the dimensions of the vector and
doing the defined stacked matrix multiply, then removing the last dimension
from the result. Note that in the middle step we have a stack of matrices
and after removing the last dimension we will still have a stack of
matrices. What we want is a stack of vectors, but we can't have those with
our conventions. This makes the result somewhat unexpected. How should we
resolve this?
I'm afraid I don't quite understand the issue. Maybe a more specific
example of the shapes you have in mind would help? Here's my attempt.

Suppose we have two arrays:
a with shape (i, j, k)
b with shape (k,)

Following the logic you describe from PEP465, for a @ b we have shapes
transform like so:
(i, j, k,) @ (k, 1) -> (i, j, 1) -> (i, j)

This makes sense to me as a stack of vectors, as long as you are imagining
the original stack of matrices as along the first dimension. Which I'll
note is the default behavior for the new np.stack (
https://github.com/numpy/numpy/pull/5605).
Charles R Harris
2015-06-03 21:08:58 UTC
Permalink
On Sat, May 30, 2015 at 3:23 PM, Charles R Harris <
Post by Charles R Harris
The problem arises when multiplying a stack of matrices times a vector.
PEP465 defines this as appending a '1' to the dimensions of the vector and
doing the defined stacked matrix multiply, then removing the last dimension
from the result. Note that in the middle step we have a stack of matrices
and after removing the last dimension we will still have a stack of
matrices. What we want is a stack of vectors, but we can't have those with
our conventions. This makes the result somewhat unexpected. How should we
resolve this?
I'm afraid I don't quite understand the issue. Maybe a more specific
example of the shapes you have in mind would help? Here's my attempt.
a with shape (i, j, k)
b with shape (k,)
This makes sense to me as a stack of vectors, as long as you are imagining
the original stack of matrices as along the first dimension. Which I'll
note is the default behavior for the new np.stack (
https://github.com/numpy/numpy/pull/5605).
Yes, you end up with a stack of vectors, but matmul will interpret them as
a stack of matrices. I decided there is nothing to be done there and just
documented it as a potential gotcha. The other possibility would be to
prohibit or warn on stacked matrices and vectors for the `@` operator and
that might limit what some folks want to do.

Chuck
Loading...