To elaborate on that point: knowing that numpy accumulates in a simple
first-to-last sweep, and does not implicitly upcast, the original problem
can be solved in several ways: by specifying a higher precision to sum with,
or by a nested summation, like X.mean(0).mean(0) == 1.0. I personally like
this explicitness, and am wary of numpy doing overly clever things behind
the scenes, as I can think of other code that might break if things
change too radically. For instance, I often sort large arrays with a large
spread in magnitudes before summation, relying on the fact that summing the
smallest values first gives the best precision. Any changes made to reduction
behavior should try to be backwards compatible with such properties of
straightforward reductions, or else a lot of code is going to break.
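To make those workarounds concrete, here is a small sketch; the array shape and values are just an illustration, not the original poster's actual data:

```python
import numpy as np

# A float32 array of ones; the shape is purely illustrative.
X = np.ones((1000, 1000), dtype=np.float32)

# Workaround 1: specify a higher precision to accumulate with.
m64 = X.mean(dtype=np.float64)

# Workaround 2: nested summation, reducing one axis at a time, so each
# partial sum stays small and rounding error does not compound.
m_nested = X.mean(0).mean(0)

# Sorting before summation: adding the smallest magnitudes first gives
# the best precision for a naive first-to-last sweep.
v = np.abs(np.random.randn(10000)).astype(np.float32)
s_sorted = np.sort(v).sum()

print(m64, m_nested)
```

Both explicit variants recover an exact 1.0 here, where a plain float32 sweep over the flattened array could drift.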
I suppose using maximum precision internally, and nesting all reductions
over multiple axes of an ndarray, are both easy-to-implement improvements
that do not come with any drawbacks I can think of. Actually, I am not so
sure about the maximum precision: I personally prefer to make an informed
decision about the precision used, and to get an error on a platform that
does not support the specified precision, rather than obtain subtly or
horribly broken results without warning when moving my code to a different
platform, compiler, or whatever.
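For reference, the kind of "clever" accumulator under discussion can be sketched as a pairwise summation: recursively split the array and add the halves, so rounding error grows like O(log n) rather than O(n). This is an illustrative pure-Python version, not a claim about numpy's actual implementation:

```python
import numpy as np

def pairwise_sum(a, block=8):
    """Sum a 1-D array in float32 by recursive halving (illustrative)."""
    if len(a) <= block:
        s = np.float32(0.0)
        for x in a:
            s = np.float32(s + x)  # keep every partial sum in float32
        return s
    mid = len(a) // 2
    return np.float32(pairwise_sum(a[:mid], block) + pairwise_sum(a[mid:], block))

# 0.1 is not exactly representable in float32, so a naive first-to-last
# sweep over many copies drifts; pairwise partial sums stay small.
v = np.full(10**5, 0.1, dtype=np.float32)
naive = np.float32(0.0)
for x in v:
    naive = np.float32(naive + x)
pw = pairwise_sum(v)
print(naive, pw)  # pairwise typically lands much closer to 10000.0
```

The point stands either way: whichever accumulator is chosen, it should be documented, since code like the sorted-summation trick above depends on the details.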
On Fri, Jul 25, 2014 at 5:37 AM, Eelco Hoogendoorn <
Post by Eelco Hoogendoorn
Perhaps it is a slightly semantic discussion; but all fp calculations
have errors, and there are always strategies for making them smaller. We
just don't happen to like the error in this case; but rest assured it
won't be hard to find new cases of 'blatantly wrong' results, no matter
what accumulator is implemented. That's no reason not to try to be clever
about it, but there isn't going to be an algorithm that is best for all
possible inputs, and in the end the most important thing is that the
algorithm used is specified in the docs.
Sent: 25-7-2014 00:10
Subject: Re: [Numpy-discussion] numpy.mean still broken for
This isn't a bug report, but rather a feature request.
I'm not sure this statement is correct. The mean of a float32 array
can certainly be computed as a float32. Currently this is
not necessarily what happens, not even approximately.
That feels a lot like a bug, even if we can readily understand
how the algorithm currently used produces it. To say whether
it is a bug or not, don't we have to ask about the intent of `mean`?
If the intent is to sum and divide, then it is not a bug.
If the intent is to produce the mean, then it is a bug.
NumPy-Discussion mailing list