Discussion:
[Numpy-discussion] Default type for functions that accumulate integers
Charles R Harris
2017-01-03 02:27:09 UTC
Hi All,

Currently functions like trace use the C long type as the default
accumulator for integer types of lesser precision:

    dtype : dtype, optional
        Determines the data-type of the returned array and of the accumulator
        where the elements are summed. If dtype has the value None and `a` is
        of integer type of precision less than the default integer
        precision, then the default integer precision is used. Otherwise,
        the precision is the same as that of `a`.

The problem with this is that the precision of long varies with the
platform so that the result varies, see gh-8433
<https://github.com/numpy/numpy/issues/8433> for a complaint about this.
There are two possible alternatives that seem reasonable to me:


1. Use 32 bit accumulators on 32 bit platforms and 64 bit accumulators
on 64 bit platforms.
2. Always use 64 bit accumulators.
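
The docstring rule quoted above can be checked directly. This is a minimal sketch, not part of the original message; it assumes `np.int_` is the default integer type of the installed NumPy. The width printed for the default accumulator depends on the platform and NumPy version, which is exactly the complaint. Passing `dtype` explicitly pins the width.

```python
import numpy as np

# Small integer input: with dtype=None the accumulator is the default integer.
m8 = np.ones((3, 3), dtype=np.int8)
default_result = np.trace(m8)
print(default_result.dtype)   # platform dependent: int32 or int64

# Passing dtype explicitly pins the accumulator width on every platform.
pinned_result = np.trace(m8, dtype=np.int64)
print(pinned_result.dtype)    # int64
```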

Thoughts?

Chuck
Nathaniel Smith
2017-01-03 02:46:08 UTC
On Mon, Jan 2, 2017 at 6:27 PM, Charles R Harris wrote:
Post by Charles R Harris
Currently functions like trace use the C long type as the default
<snip>
1. Use 32 bit accumulators on 32 bit platforms and 64 bit accumulators
on 64 bit platforms.
2. Always use 64 bit accumulators.

This is a special case of a more general question: right now we use
the default integer precision (i.e., what you get from np.array([1]),
or np.arange, or np.dtype(int)), and it turns out that the default
integer precision itself varies in confusing ways, and this is a
common source of bugs. Specifically: right now it's 32-bit on 32-bit
builds, and 64-bit on 64-bit builds, except on Windows where it's
always 32-bit. This matches the default precision of Python 2 'int'.
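
A quick way to see which default a given build uses (a sketch for illustration; the printed widths depend on the platform and the NumPy version, so no particular output is assumed here):

```python
import numpy as np

default_int = np.dtype(int)
print(default_int, default_int.itemsize * 8, "bits")

# Array creation without an explicit dtype uses the same default.
from_list = np.array([1]).dtype
from_arange = np.arange(3).dtype
print(from_list, from_arange)

# Pointer-sized integer, for comparison with the default.
print(np.dtype(np.intp))
```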

So some options include:
- make the default integer precision 64 bits everywhere
- make the default integer precision 32 bits on 32-bit systems, and
64 bits on 64-bit systems (including Windows)
- leave the default integer precision the same, but make accumulators
64 bits everywhere
- leave the default integer precision the same, but make accumulators
64 bits on 64-bit systems (including Windows)
- ...

Given the prevalence of 64-bit systems these days, and the fact that
the current setup makes it very easy to write code that seems to work
when tested on a 64-bit system but that silently returns incorrect
results on 32-bit systems, it sure would be nice if we could switch to
a 64-bit default everywhere. (You could still get 32-bit integers, of
course, you'd just have to ask for them explicitly.)
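
To illustrate the failure mode, here is a sketch that emulates a 32-bit default by requesting `int32` explicitly, so it behaves the same on any machine. The choice of 13! as the example is mine; it is simply the first factorial that no longer fits in 32 bits.

```python
import numpy as np

factors = np.arange(1, 14)   # 1..13, default integer dtype

# Emulate a platform whose default integer is 32 bits wide.
wrapped = factors.astype(np.int32).prod(dtype=np.int32)

# Ask for 64 bits explicitly: same answer on every platform.
exact = factors.prod(dtype=np.int64)

print(wrapped)   # 1932053504, silently wrong
print(exact)     # 6227020800, which is 13!
```

No warning is raised for the wrapped result, which is why such code can pass tests on a 64-bit machine and still be wrong elsewhere.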

Things we'd need to know more about before making a decision:
- compatibility: if we flip this switch, how much code breaks? In
general, correct numpy-using code has to be prepared to handle
np.dtype(int) being 64 bits, and in fact there might be more code that
accidentally assumes that np.dtype(int) is always 64 bits than there
is code that assumes it is always 32 bits. But that's theory; to know
how bad this is we would need to actually run some projects'
test suites and see whether they break or not.
- speed: there's probably some cost to using 64-bit integers on 32-bit
systems; how big is the penalty in practice?

-n
--
Nathaniel J. Smith -- https://vorpus.org
Sebastian Berg
2017-01-03 17:08:41 UTC
Post by Nathaniel Smith
On Mon, Jan 2, 2017 at 6:27 PM, Charles R Harris
Post by Charles R Harris
Hi All,
Currently functions like trace use the C long type as the default
<snip>
Post by Nathaniel Smith
- compatibility: if we flip this switch, how much code breaks? In
general correct numpy-using code has to be prepared to handle
np.dtype(int) being 64-bits, and in fact there might be more code that
accidentally assumes that np.dtype(int) is always 64-bits than there
is code that assumes it is always 32-bits. But that's theory; to know
how bad this is we would need to try actually running some projects
test suites and see whether they break or not.
- speed: there's probably some cost to using 64-bit integers on 32-bit
systems; how big is the penalty in practice?
I agree with trying to switch the default in general first; I don't
like the idea of having two different "defaults".

There are two issues: one is the change on Python 2 (the default numpy
integer type would no longer inherit from Python int), and the other is
any issue due to increased precision (more RAM usage, code that somehow
expects lower precision, etc.).
I cannot say I know for sure, but I would be extremely surprised if there
is a speed difference between 32-bit and 64-bit architectures, except for
the general slowdown you get due to bus speeds, etc. when going to a
higher bit width.
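
The RAM point is easy to quantify. A small sketch (the array length is arbitrary, chosen only for illustration):

```python
import numpy as np

n = 1024**2
sizes = {}
for dtype in (np.int32, np.int64):
    v = np.ones(n, dtype=dtype)
    sizes[v.dtype.name] = v.nbytes
    print(v.dtype, v.nbytes // 1024, "KiB")
```

The same number of elements takes exactly twice the memory at 64 bits.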

If the inheritance for some reason is a bigger issue, we might limit
the change to Python 3. For other possible problems, I think we may
have difficulty assessing how much is affected. The problem is that
the most affected code should be projects that are only used on Windows,
or so. Bigger projects should work fine already (they are more likely
to get better, since they are not tested as well on platforms with a
32-bit long, especially 64-bit Windows).

Of course, limiting the change to Python 3 could have the advantage of
not affecting older projects, which are possibly more likely to
rely specifically on the current behaviour.

So, I would be open to trying the change. I think the idea of at least
changing it in Python 3 has been brought up a couple of times,
including by Julian, so maybe it is time to give it a shot....

It would be interesting to see if anyone knows of projects that may be
affected (for example because they are designed to only run on Windows
or limited hardware), and whether avoiding any change in Python 2
might mitigate problems here as well (in addition to avoiding the
inheritance change).

Best,

Sebastian
Charles R Harris
2017-01-03 18:15:14 UTC
Post by Sebastian Berg
<snip>
It would be interesting to see if anyone knows projects that may be
affected (for example because they are designed to only run on windows
or limited hardware), and if avoiding to change anything in python 2
might mitigate problems here as well (additionally to avoiding the
inheritance change)?
There have been a number of reports of problems due to the inheritance,
stemming both from the changing precision and, IIRC, from differences in
print format or some such. So I don't expect that there will be no
problems, but they will probably not be difficult to fix.

Chuck
Antoine Pitrou
2017-01-03 19:59:47 UTC
On Mon, 2 Jan 2017 18:46:08 -0800
Post by Nathaniel Smith
- make the default integer precision 64-bits everywhere
- make the default integer precision 32-bits on 32-bit systems, and
64-bits on 64-bit systems (including Windows)
Either of those two would be the best IMO.

Intuitively, I think people would expect 32-bit ints in 32-bit
processes by default, and 64-bit ints in 64-bit processes likewise. So
I would slightly favour the latter option.
Post by Nathaniel Smith
- leave the default integer precision the same, but make accumulators
64-bits everywhere
- leave the default integer precision the same, but make accumulators
64-bits on 64-bit systems (including Windows)
Both of these options introduce a confusing discrepancy.
Post by Nathaniel Smith
- speed: there's probably some cost to using 64-bit integers on 32-bit
systems; how big is the penalty in practice?
Ok, I have fired up a Windows VM to compare 32-bit and 64-bit builds.
Numpy version is 1.11.2, Python version is 3.5.2. Keep in mind those
are Anaconda builds of Numpy, with MKL enabled for linear algebra;
YMMV.

For each benchmark, the first number is the result on the 32-bit build,
the second number on the 64-bit build.

Simple arithmetic
-----------------

v = np.ones(1024**2, dtype='int32')
%timeit v + v    # 1.73 ms per loop | 1.78 ms per loop
%timeit v * v    # 1.77 ms per loop | 1.79 ms per loop
%timeit v // v   # 5.89 ms per loop | 5.39 ms per loop
v = np.ones(1024**2, dtype='int64')
%timeit v + v    # 3.54 ms per loop | 3.54 ms per loop
%timeit v * v    # 5.61 ms per loop | 3.52 ms per loop
%timeit v // v   # 17.1 ms per loop | 13.9 ms per loop

Linear algebra
--------------

m = np.ones((1024,1024), dtype='int32')
m = np.ones((1024,1024), dtype='int64')

Sorting
-------

v = np.random.RandomState(42).randint(1000, size=1024**2).astype('int32')
%timeit np.sort(v)   # 43.4 ms per loop | 44 ms per loop
v = np.random.RandomState(42).randint(1000, size=1024**2).astype('int64')
%timeit np.sort(v)   # 61.5 ms per loop | 45.5 ms per loop

Indexing
--------

v = np.ones(1024**2, dtype='int32')
%timeit v[v[::-1]]   # 2.38 ms per loop | 4.63 ms per loop
v = np.ones(1024**2, dtype='int64')
%timeit v[v[::-1]]   # 6.9 ms per loop | 3.63 ms per loop

Quick summary:
- for very simple operations, 32b and 64b builds can have the same perf
on each given bitwidth (though speed is uniformly halved on 64-bit
integers when the given operation is SIMD-vectorized)
- for more sophisticated operations (such as element-wise
multiplication or division, or quicksort, but much more so on the
matrix product), 32b builds are competitive with 64b builds on 32-bit
ints, but lag behind on 64-bit ints
- for indexing, it's desirable to use a "native" width integer,
regardless of whether that means 32- or 64-bit

Of course the numbers will vary depending on the platform (read:
compiler), but some aspects of this comparison will probably translate
to other platforms.
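
For anyone who wants to repeat the comparison outside IPython, the arithmetic, sorting and indexing cases above could be scripted roughly as follows. This is a sketch: the `bench` helper and its parameters are hypothetical names, and it reports a plain mean over `number` calls rather than what `%timeit` reports, so the figures are not directly comparable with the ones above. The linear algebra case is left out because its timed statement does not appear above.

```python
import timeit

import numpy as np


def bench(dtype, size=1024**2, number=10):
    """Return mean seconds per call for a few integer operations."""
    v = np.ones(size, dtype=dtype)
    r = np.random.RandomState(42).randint(1000, size=size).astype(dtype)
    cases = {
        "v + v": lambda: v + v,
        "v * v": lambda: v * v,
        "v // v": lambda: v // v,
        "np.sort(r)": lambda: np.sort(r),
        "v[v[::-1]]": lambda: v[v[::-1]],
    }
    return {name: timeit.timeit(fn, number=number) / number
            for name, fn in cases.items()}


if __name__ == "__main__":
    for dtype in ("int32", "int64"):
        for name, seconds in bench(dtype).items():
            print(f"{dtype:5s} {name:12s} {seconds * 1e3:8.2f} ms per loop")
```

Running the same script under a 32-bit and a 64-bit interpreter gives the two columns of the comparison.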

Regards

Antoine.
