Discussion:
[Numpy-discussion] ENH: compute many inner products quickly
Mark Daoust
2016-06-06 00:08:32 UTC
Permalink
Here's the einsum version:

`es = np.einsum('Na,ab,Nb->N',X,A,X)`

But that's running ~45x slower than your version.

OT: anyone know why einsum is so bad for this one?
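For anyone following along, here is a self-contained setup that checks the einsum version against the plain matrix-multiply trick. The array sizes are assumptions, picked smaller than the thread's ~50k case so the naive einsum finishes quickly:

```python
import numpy as np

# Assumed sizes: N row vectors of dimension d.
N, d = 5_000, 64
rng = np.random.default_rng(0)
X = rng.standard_normal((N, d))
A = rng.standard_normal((d, d))

# Row-wise quadratic forms x_i^T A x_i, two ways.
es = np.einsum('Na,ab,Nb->N', X, A, X)   # three-operand einsum
mm = (X * (X @ A.T)).sum(axis=1)         # matrix-multiply trick
assert np.allclose(es, mm)
```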

Mark Daoust
I recently ran into an application where I had to compute many inner
products quickly (roughly 50k inner products in less than a second).
My first instinct was to look for a NumPy function to quickly compute
this, such as np.inner. However, it looks like np.inner has some other
behavior and I couldn't get tensordot/einsum to work for me.
Then a labmate pointed out that I can just do some slick matrix
multiplication to compute these.
I opened [a PR] with this, and proposed that we define a new function
called `inner_prods` for it.
The main challenge is to figure out how to transition the behavior of
all these operations, while preserving backwards compatibility. Quite
likely, we need to pick new names for these functions, though we should try
to pick something that doesn't suggest that they are second class
alternatives.
Do we choose new function names? Do we add a keyword arg that changes what
np.inner returns?
[a PR]: https://github.com/numpy/numpy/pull/7690
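For concreteness, something like the following is the behavior being discussed. The name `inner_prods` comes from the quoted proposal, but the signature and implementation here are only a sketch, not the PR's actual code:

```python
import numpy as np

def inner_prods(X, A=None):
    """Row-wise inner products: x_i . x_i, or x_i^T A x_i when A is given.

    Sketch only; not the API from the PR.
    """
    if A is None:
        return np.einsum('ij,ij->i', X, X)
    return (X * (X @ A.T)).sum(axis=1)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
print(inner_prods(X))  # squared norms of the rows: 5 and 25
```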
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Stephan Hoyer
2016-06-06 00:44:54 UTC
Permalink
Post by Mark Daoust
`es = np.einsum('Na,ab,Nb->N',X,A,X)`
But that's running ~45x slower than your version.
OT: anyone know why einsum is so bad for this one?
I think einsum can create some large intermediate arrays. It certainly
doesn't always do multiplication in the optimal order:
https://github.com/numpy/numpy/pull/5488
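As an aside for later readers: the contraction-ordering work referenced in that PR later shipped as einsum's `optimize` keyword (NumPy 1.12+), which this thread predates. With assumed sizes, the difference looks like:

```python
import numpy as np

N, d = 5_000, 64  # assumed sizes for illustration
rng = np.random.default_rng(0)
X = rng.standard_normal((N, d))
A = rng.standard_normal((d, d))

# Default einsum contracts all three operands in one nested loop;
# optimize=True (NumPy >= 1.12) picks a pairwise order, e.g. (X @ A) first.
naive = np.einsum('Na,ab,Nb->N', X, A, X)
opt = np.einsum('Na,ab,Nb->N', X, A, X, optimize=True)
assert np.allclose(naive, opt)
```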
CJ Carey
2016-06-06 01:08:59 UTC
Permalink
A simple workaround gets the speed back:


In [11]: %timeit (X.T * A.dot(X.T)).sum(axis=0)
1 loop, best of 3: 612 ms per loop

In [12]: %timeit np.einsum('ij,ji->j', A.dot(X.T), X)
1 loop, best of 3: 414 ms per loop


If working as advertised, the code in gh-5488 will convert the
three-argument einsum call into my version automatically.
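A quick sanity check (sizes assumed) that both workarounds above agree with the original three-argument call:

```python
import numpy as np

N, d = 2_000, 50  # assumed sizes, small enough for a fast check
rng = np.random.default_rng(1)
X = rng.standard_normal((N, d))
A = rng.standard_normal((d, d))

direct = np.einsum('Na,ab,Nb->N', X, A, X)
via_sum = (X.T * A.dot(X.T)).sum(axis=0)           # broadcast-and-sum version
via_einsum = np.einsum('ij,ji->j', A.dot(X.T), X)  # two-operand einsum version
assert np.allclose(direct, via_sum)
assert np.allclose(direct, via_einsum)
```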
Post by Stephan Hoyer
Post by Mark Daoust
`es = np.einsum('Na,ab,Nb->N',X,A,X)`
But that's running ~45x slower than your version.
OT: anyone know why einsum is so bad for this one?
I think einsum can create some large intermediate arrays. It certainly
doesn't always do multiplication in the optimal order:
https://github.com/numpy/numpy/pull/5488