Discussion:
[Numpy-discussion] array comprehension
Neal Becker
2016-11-04 12:06:32 UTC
I find I often write:
np.array ([some list comprehension])

mainly because list comprehensions are just so sweet.

But I imagine this isn't particularly efficient.

I wonder if numpy has a "better" way, and if not, maybe it would be a nice
addition?
Francesc Alted
2016-11-04 12:26:10 UTC
Post by Neal Becker
np.array ([some list comprehension])
mainly because list comprehensions are just so sweet.
But I imagine this isn't particularly efficient.
Right. Using a generator and np.fromiter() will avoid the creation of the
intermediate list. Something like:

np.fromiter((i for i in range(x)), dtype=int) # dtype is required; use xrange for Python 2
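For illustration, a minimal sketch comparing the two approaches. The length `x` and the expression `i * i` are placeholders chosen for the example, not taken from the thread. Note that np.fromiter() requires a dtype; the optional count argument lets NumPy allocate the result once.

```python
import numpy as np

x = 10  # hypothetical length, chosen only for illustration

# Baseline: the list comprehension builds a full Python list first,
# and np.array() then copies it into an array.
from_list = np.array([i * i for i in range(x)])

# np.fromiter() consumes the generator directly; dtype must be given.
# count is optional and lets NumPy size the output up front.
from_gen = np.fromiter((i * i for i in range(x)), dtype=np.int64, count=x)
```

Both calls produce the same values; whether the second is measurably faster depends on the workload and is worth timing before relying on it.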
Post by Neal Becker
I wonder if numpy has a "better" way, and if not, maybe it would be a nice
addition?
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion
--
Francesc Alted
Neal Becker
2016-11-04 13:36:02 UTC
Post by Francesc Alted
Post by Neal Becker
np.array ([some list comprehension])
mainly because list comprehensions are just so sweet.
But I imagine this isn't particularly efficient.
Right. Using a generator and np.fromiter() will avoid the creation of the
intermediate list. Something like:
np.fromiter((i for i in range(x)), dtype=int) # dtype is required; use xrange for Python 2
Does this generalize to >1 dimensions?
Francesc Alted
2016-11-04 14:12:11 UTC
Post by Neal Becker
Post by Francesc Alted
Post by Neal Becker
np.array ([some list comprehension])
mainly because list comprehensions are just so sweet.
But I imagine this isn't particularly efficient.
Right. Using a generator and np.fromiter() will avoid the creation of the
intermediate list. Something like:
np.fromiter((i for i in range(x)), dtype=int) # dtype is required; use xrange for Python 2
Does this generalize to >1 dimensions?
A reshape() is not enough? What do you want to do exactly?
--
Francesc Alted
Stephan Hoyer
2016-11-04 15:04:13 UTC
Post by Neal Becker
Does this generalize to >1 dimensions?
A reshape() is not enough? What do you want to do exactly?
np.fromiter takes scalar input and only builds a 1D array. So it actually
can't combine multiple values at once unless they are flattened out in
Python. It could be nice to add support for non-scalar inputs, stacking
them similarly to np.array. Likewise, it could be nice to add an axis
argument, so it can work similarly to np.stack.
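A minimal sketch of the "flatten in Python, then reshape" workaround described above. The `rows()` generator and `n_cols` are hypothetical names used only for this example.

```python
import itertools
import numpy as np

def rows():
    # Hypothetical generator yielding one fixed-length row at a time.
    for i in range(3):
        yield (i, i + 10, i + 20)

n_cols = 3  # every row must have this length for the reshape to work

# Flatten the rows into a stream of scalars, which is what np.fromiter()
# accepts, then restore the second dimension with reshape().
flat = itertools.chain.from_iterable(rows())
arr = np.fromiter(flat, dtype=np.int64).reshape(-1, n_cols)
```

Later NumPy releases may accept subarray dtypes in np.fromiter() directly; that is worth checking against the documentation for the installed version before relying on it.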

More generally, you might want to iterate and rebuild over arbitrary
dimension(s) of an array. Something like
np.stack([x for x in np.unstack(y, axis)], axis)

But, we also don't have an unstack function. This would mostly be syntactic
sugar, but I think it would be a nice addition. Such a function actually
exists in TensorFlow:
https://g3doc.corp.google.com/third_party/tensorflow/g3doc/api_docs/python/array_ops.md?cl=head#unstack
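As a sketch of what such a function could look like, here is a hypothetical `unstack` helper built on np.moveaxis(). It is an assumption made for illustration, not an existing NumPy API at the time of this thread, and the doubling step stands in for any per-slice operation.

```python
import numpy as np

def unstack(a, axis=0):
    # Hypothetical helper: split `a` into a list of sub-arrays along `axis`.
    # Moving `axis` to the front lets plain iteration yield the slices.
    return list(np.moveaxis(a, axis, 0))

y = np.arange(24).reshape(2, 3, 4)
axis = 1

pieces = unstack(y, axis)  # three arrays, each of shape (2, 4)

# Apply some per-slice operation, then rebuild along the same axis.
rebuilt = np.stack([p * 2 for p in pieces], axis)
```

Stacking along the same axis that was unstacked restores the original shape, so the pair behaves as a round trip around the per-slice operation.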
Daπid
2016-11-04 15:11:51 UTC
Post by Stephan Hoyer
But, we also don't have an unstack function. This would mostly be syntactic
sugar, but I think it would be a nice addition. Such a function actually
exists in TensorFlow:
https://g3doc.corp.google.com/third_party/tensorflow/g3doc/api_docs/python/array_ops.md?cl=head#unstack
That link is behind a login wall. This is the public version:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/array_ops.md
Ryan May
2016-11-04 15:33:22 UTC
Post by Stephan Hoyer
Post by Neal Becker
Does this generalize to >1 dimensions?
A reshape() is not enough? What do you want to do exactly?
np.fromiter takes scalar input and only builds a 1D array. So it actually
can't combine multiple values at once unless they are flattened out in
Python. It could be nice to add support for non-scalar inputs, stacking
them similarly to np.array. Likewise, it could be nice to add an axis
argument, so it can work similarly to np.stack.
itertools.product, itertools.permutations, etc. with np.fromiter (and
reshape) is probably also useful here, though it doesn't solve the
non-scalar problem.
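A minimal sketch of that combination, assuming a simple multiplication table as the target. The sizes and the `i * j` expression are placeholders for the example.

```python
import itertools
import numpy as np

n_rows, n_cols = 3, 4  # hypothetical sizes for the example

# itertools.product varies the last index fastest, which matches the
# row-major (C) order that reshape() uses by default.
pairs = itertools.product(range(n_rows), range(n_cols))
table = np.fromiter(
    (i * j for i, j in pairs),
    dtype=np.int64,
    count=n_rows * n_cols,
).reshape(n_rows, n_cols)
```

Each generated element is still a scalar, so this covers index-driven 2-D arrays but not the case where each item is itself a sequence.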

Ryan
--
Ryan May
Neal Becker
2016-11-04 15:56:39 UTC
Post by Francesc Alted
Post by Neal Becker
Post by Francesc Alted
Post by Neal Becker
np.array ([some list comprehension])
mainly because list comprehensions are just so sweet.
But I imagine this isn't particularly efficient.
Right. Using a generator and np.fromiter() will avoid the creation of the
intermediate list. Something like:
np.fromiter((i for i in range(x)), dtype=int) # dtype is required; use xrange for Python 2
Does this generalize to >1 dimensions?
A reshape() is not enough? What do you want to do exactly?
I was thinking about:
x = np.array([[L1] L2]), where L1 and L2 each take the form of a list
comprehension clause (L1 nested inside L2), as a means to create a 2-D array
(in this example)
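One way to read this is as a nested list comprehension passed to np.array(). The sketch below assumes a hypothetical element function `f(i, j)` and also shows a broadcasting alternative that avoids the Python-level loops when `f` is built from array-friendly operations.

```python
import numpy as np

n_rows, n_cols = 3, 4  # hypothetical sizes for the example

def f(i, j):
    # Hypothetical element function; works on scalars and on arrays.
    return 10 * i + j

# Nested list comprehension: inner clause builds a row, outer clause
# builds the list of rows, and np.array() converts the result.
nested = np.array([[f(i, j) for j in range(n_cols)] for i in range(n_rows)])

# Broadcasting alternative: a column of row indices against a row of
# column indices evaluates f over the whole grid at once.
i = np.arange(n_rows)[:, np.newaxis]
j = np.arange(n_cols)[np.newaxis, :]
vectorized = f(i, j)
```

The broadcasting form only applies when the element function can operate on whole arrays; the nested comprehension works for arbitrary Python callables.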
Robert Kern
2016-11-04 15:59:23 UTC
Post by Neal Becker
Post by Francesc Alted
Post by Neal Becker
np.array ([some list comprehension])
mainly because list comprehensions are just so sweet.
But I imagine this isn't particularly efficient.
Right. Using a generator and np.fromiter() will avoid the creation of the
intermediate list. Something like:
np.fromiter((i for i in range(x)), dtype=int) # dtype is required; use xrange for Python 2
Does this generalize to >1 dimensions?
No.

--
Robert Kern
Nathaniel Smith
2016-11-04 17:24:22 UTC
Are you sure fromiter doesn't make an intermediate list or equivalent? It
has to collect all the values before it can know the shape or dtype of the
array to put them in.
Post by Neal Becker
np.array ([some list comprehension])
mainly because list comprehensions are just so sweet.
But I imagine this isn't particularly efficient.
Right. Using a generator and np.fromiter() will avoid the creation of the
intermediate list. Something like:

np.fromiter((i for i in range(x)), dtype=int) # dtype is required; use xrange for Python 2
Post by Neal Becker
I wonder if numpy has a "better" way, and if not, maybe it would be a nice
addition?
--
Francesc Alted
Stephan Hoyer
2016-11-04 17:32:05 UTC
Post by Nathaniel Smith
Are you sure fromiter doesn't make an intermediate list or equivalent? It
has to collect all the values before it can know the shape or dtype of the
array to put them in.
fromiter dynamically resizes a NumPy array, like a Python list, except with
a growth factor of 1.5 (rather than roughly 1.125):
https://github.com/numpy/numpy/blob/bb59409abf5237c155a1dc4c4d5b31e4acf32fbe/numpy/core/src/multiarray/ctors.c#L3721
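To make the resizing strategy concrete, here is a pure-Python sketch of a buffer that grows by about 50% whenever it fills up. The `collect` function and its minimum capacity of 4 are assumptions for illustration; this is not NumPy's actual C implementation.

```python
import numpy as np

def collect(iterable, dtype):
    # Hypothetical helper mimicking the over-allocation idea in Python.
    buf = np.empty(0, dtype=dtype)
    n = 0  # number of slots actually filled
    for value in iterable:
        if n == buf.size:
            # Buffer is full: allocate about 1.5x the current size
            # (with a small floor) and copy the filled part across.
            new_size = max(4, n + n // 2)
            grown = np.empty(new_size, dtype=dtype)
            grown[:n] = buf[:n]
            buf = grown
        buf[n] = value
        n += 1
    # Trim the unused tail so the caller sees exactly n elements.
    return buf[:n].copy()

squares = collect((i * i for i in range(100)), np.int64)
```

Because each growth step multiplies the capacity, the number of reallocations is logarithmic in the final length, so the total copying cost stays proportional to the number of elements.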
Nathaniel Smith
2016-11-04 17:36:06 UTC
Post by Stephan Hoyer
Post by Nathaniel Smith
Are you sure fromiter doesn't make an intermediate list or equivalent?
It has to collect all the values before it can know the shape or dtype of
the array to put them in.
Post by Stephan Hoyer
fromiter dynamically resizes a NumPy array, like a Python list, except
with a growth factor of 1.5 (rather than 1.25):
https://github.com/numpy/numpy/blob/bb59409abf5237c155a1dc4c4d5b31e4acf32fbe/numpy/core/src/multiarray/ctors.c#L3721


Oh, right, and the dtype argument is mandatory, which is what makes this
possible.
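A short sketch showing the required dtype in practice. The generator contents are placeholders for the example.

```python
import numpy as np

error = None
try:
    # Omitting dtype fails before any element is consumed.
    np.fromiter(i for i in range(5))
except TypeError as exc:
    error = exc

# With dtype supplied, each element is converted as it arrives, so the
# output buffer can be filled without first collecting every value.
result = np.fromiter((i for i in range(5)), dtype=float)
```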

-n
Chris Barker
2016-11-04 20:08:44 UTC
Post by Nathaniel Smith
Post by Stephan Hoyer
fromiter dynamically resizes a NumPy array, like a Python list, except
with a growth factor of 1.5
Oh, right, and the dtype argument is mandatory, which is what makes this
possible.
Couldn't it determine the dtype from the first element, and then barf later
if an incompatible one shows up?

And then we could adapt this code to np.array() and get nice performance
with no extra functions to think about calling...

And off the top of my head, I can't think of why it couldn't be generalized
to the nd case as well.
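A rough sketch of the first idea, written as a wrapper around np.fromiter(). The name `fromiter_inferred` and the choice of "safe" casting as the compatibility rule are assumptions for illustration; the per-element check runs in Python, so this shows the behaviour rather than the performance.

```python
import itertools
import numpy as np

def fromiter_inferred(iterable):
    # Hypothetical helper: take the dtype from the first element.
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("cannot infer a dtype from an empty iterable") from None
    dtype = np.asarray(first).dtype

    def checked():
        # Re-attach the first element, then verify that every value can
        # be cast safely to the inferred dtype before passing it on.
        for value in itertools.chain([first], it):
            if not np.can_cast(np.asarray(value).dtype, dtype, casting="safe"):
                raise TypeError(
                    "element %r is incompatible with inferred dtype %s"
                    % (value, dtype)
                )
            yield value

    return np.fromiter(checked(), dtype=dtype)

ints = fromiter_inferred(i for i in range(4))
```

With this rule, integers following a float are accepted, while a float following an integer raises TypeError, since the array's dtype was already fixed by the first element.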

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov