Post by Benjamin Root
Just pointing out np.loadtxt(..., ndmin=2) will always return a 2D
array. Notice that without that option, the result is effectively
squeezed. So if you don't specify that option, and you load up a CSV
file with only one row, you will get a very differently shaped array
than if you load up a CSV file with two rows.
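A minimal sketch of the shape difference described above, assuming in-memory CSV text via io.StringIO instead of real files (the variable names are made up for illustration):

```python
import io
import numpy as np

# Illustrative in-memory "files": one row versus two rows of CSV data.
one_row = "1,2,3\n"
two_rows = "1,2,3\n4,5,6\n"

# Default behaviour: the result is squeezed, so a single row comes back 1-D.
squeezed_one = np.loadtxt(io.StringIO(one_row), delimiter=",")
squeezed_two = np.loadtxt(io.StringIO(two_rows), delimiter=",")

# With ndmin=2 both results keep their (rows, columns) layout.
kept_one = np.loadtxt(io.StringIO(one_row), delimiter=",", ndmin=2)
kept_two = np.loadtxt(io.StringIO(two_rows), delimiter=",", ndmin=2)

print(squeezed_one.shape, squeezed_two.shape)  # (3,) (2, 3)
print(kept_one.shape, kept_two.shape)          # (1, 3) (2, 3)
```

Code that indexes the result as data[row, col] only works for the one-row file when ndmin=2 is passed.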
Oh, well I personally think that default squeeze is an abomination :).
logics, and we have to pick one.
likely better.
general.
Anyway, I *really* do not have an opinion about what is better.
objects or array_interface stuff. Which in this case is really
unnecessary I think.
Post by Benjamin Root
Ben Root
On Tue, Nov 10, 2015 at 10:07 AM, Irvin Probst
Actually, it is the "sequence special case" type ;). (matlab does not
have this, since matlab always returns 2-D I realized).
As I said, if usecols is like indexing, the result
arr = np.loadtxt(f)
arr = arr[usecols]
in which case a 1-D array is returned if you put a scalar into usecols
(and you could even generalize usecols to higher dimensional
array-likes).
The way you implemented it -- which is fine, but I want to stress that
there is a real decision being made here --, you always see it as a
sequence but allow a scalar for convenience (i.e. always return a 2-D
array). It is a `sequence of ints or int` type argument and not an
array-like argument in my opinion.
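To make the two readings concrete, here is a small sketch using plain NumPy indexing on an already loaded array. It assumes the columns sit on the second axis, so it indexes with arr[:, ...], and the array contents are invented for the example:

```python
import numpy as np

# Stand-in for the array a loader would return: 3 rows, 4 columns.
arr = np.arange(12.0).reshape(3, 4)

# "usecols is like indexing": a scalar index drops the column axis.
as_index = arr[:, 2]

# "sequence of ints, scalar allowed for convenience": the scalar is
# treated as a one-element sequence, so the column axis is kept.
as_sequence = arr[:, [2]]

print(as_index.shape)     # (3,)
print(as_sequence.shape)  # (3, 1)
```

Both results hold the same values; only the number of dimensions differs, which is exactly the decision being discussed.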
The first one is whether loadtxt should always return a 2D
array or whether it should match the shape of the usecols argument.
From a CS guy's point of view I do understand your concern here.
Now from a teacher's point of view I know many people expect to
get a "matrix" (thank you Matlab...) and the "purity" of
matching the dimension of the usecols variable will be seen by
many people [1] as nerdy, useless heaviness that no one cares
about (no offense). So whatever you, seasoned numpy devs from
this mailing list, decide, I think it should be explained in the
docstring with very clear wording.
My own opinion on this first problem is that loadtxt() should
always return a 2D array, no less, no more. If I write
np.loadtxt(f)[42] it means I want to read the whole file and
then I explicitly ask for transforming the 2-D array
loadtxt() returned into a 1-D array. Otoh if I write
loadtxt(f, usecols=42) it means I don't want to read the other
columns and I want only this one, but it does not mean that I
want to change the returned array from 2-D to 1-D. I know this
new behavior might break a lot of existing code as
usecols=(42,) used to return a 1-D array, but
usecols=((((42,)))) also returns a 1-D array so the current
behavior is not consistent imho.
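A short sketch of that behavior, assuming whitespace-separated data held in an in-memory buffer (the data and variable names are invented). It also shows that ndmin=2 already gives the always-2-D result for callers who ask for it:

```python
import io
import numpy as np

# Three rows, three whitespace-separated columns (made-up data).
text = "1 2 3\n4 5 6\n7 8 9\n"

# A one-element usecols tuple with the default ndmin: the result is squeezed.
col_default = np.loadtxt(io.StringIO(text), usecols=(1,))

# Same column, but asking explicitly for at least two dimensions.
col_2d = np.loadtxt(io.StringIO(text), usecols=(1,), ndmin=2)

print(col_default.shape)  # (3,)
print(col_2d.shape)       # (3, 1)
```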
The second problem is about the wording in the docstring, when
I see "sequence of int or int" I understand I will have to cast
into a 1-D python list whatever wicked N-dimensional object I
use to store my column indexes, or hope list(my_object) will
do it fine. On the other hand when I read "array-like" the
function is telling me I don't have to worry about my object,
as long as numpy knows how to cast it into an array it will be
fine.
import numpy as np
a=[[[2,],[],[],],[],[],[]]
foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a)
should just work and return me a 2-D (or 1-D if you like)
array with the data I asked for and I don't think "a" here is
an int or a sequence of int (but it's a good example of why
loadtxt() should not match the shape of the usecols argument).
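As a rough sketch of the caller-side workaround under the "sequence of int or int" reading: normalize_usecols below is a hypothetical helper, not part of NumPy, that flattens a rectangular array-like of column indexes into a plain list before calling loadtxt. Ragged nesting such as the `a` above is not handled by it.

```python
import io
import numpy as np

def normalize_usecols(cols):
    """Hypothetical helper: flatten a scalar or rectangular array-like
    of column indexes into a flat list of Python ints."""
    flat = np.asarray(cols).ravel()
    return [int(c) for c in flat]

# Made-up data standing in for a real file.
text = "1 2 3\n4 5 6\n"

# A nested (but rectangular) index object is reduced to [0, 2].
cols = normalize_usecols([[0], [2]])

# ndmin=2 keeps the result 2-D whatever the shape of the index object was.
data = np.loadtxt(io.StringIO(text), usecols=cols, ndmin=2)

print(cols)        # [0, 2]
print(data.shape)  # (2, 2)
```

The shape of the index object never leaks into the result here; any reshaping is left to the caller.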
To make it short, let the reading function read the data in a
consistent and predictable way and then let the user
explicitly change the data's shape into anything he likes.
Regards.
[1] read non CS people trying to switch to numpy/scipy
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion