Discussion:
[Numpy-discussion] NumPy 1.12.0 release
Charles R Harris
2017-01-15 23:43:41 UTC
Permalink
Hi All,

I'm pleased to announce the NumPy 1.12.0 release. This release supports
Python 2.7 and 3.4-3.6. Wheels for all supported Python versions may be
downloaded from PyPI
<https://pypi.python.org/pypi?%3Aaction=pkg_edit&name=numpy>, and the tarball
and zip files may be downloaded from GitHub
<https://github.com/numpy/numpy/releases/tag/v1.12.0>. The release notes
and file hashes may also be found on GitHub
<https://github.com/numpy/numpy/releases/tag/v1.12.0>.

NumPy 1.12.0 is the result of 418 pull requests submitted by 139
contributors and comprises a large number of fixes and improvements. Among
the many improvements it is difficult to pick out just a few as standing
above the others, but the following may be of particular interest or
indicate areas likely to have future consequences.

* Order of operations in ``np.einsum`` can now be optimized for large speed
improvements.
* New ``signature`` argument to ``np.vectorize`` for vectorizing with core
dimensions.
* The ``keepdims`` argument was added to many functions.
* New context manager for testing warnings.
* Support for BLIS in numpy.distutils.
* Much improved support for PyPy (not yet finished).
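For the curious, the first three items look roughly like this in practice (a quick sketch against the 1.12 API; the row-norm example is just an illustration, not from the release notes):

```python
import numpy as np

a = np.random.rand(32, 32)
b = np.random.rand(32, 32)

# np.einsum can now optimize the order of operations via optimize=True
c = np.einsum('ij,jk->ik', a, b, optimize=True)

# np.vectorize with a core-dimension signature: apply norm to each row
row_norm = np.vectorize(np.linalg.norm, signature='(n)->()')
norms = row_norm(np.array([[3.0, 4.0], [6.0, 8.0]]))  # array([ 5., 10.])

# keepdims preserves the reduced axis, so the result broadcasts back
m = a.mean(axis=1, keepdims=True)  # shape (32, 1)
```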

Enjoy,

Chuck
Ralf Gommers
2017-01-16 09:42:06 UTC
Permalink
On Mon, Jan 16, 2017 at 12:43 PM, Charles R Harris <
Post by Charles R Harris
[announcement snipped]
Thanks for all the heavy lifting on this one Chuck!

Ralf
Neal Becker
2017-01-17 13:56:42 UTC
Permalink
Post by Charles R Harris
[announcement snipped]
I've installed via pip3 on linux x86_64, which gives me a wheel. My
question is, am I losing significant performance choosing this pre-built
binary vs. compiling myself? For example, my processor might have some more
features than the base version used to build wheels.
Matthew Brett
2017-01-17 18:02:42 UTC
Permalink
Hi,
Post by Neal Becker
Post by Charles R Harris
[announcement snipped]
I've installed via pip3 on linux x86_64, which gives me a wheel. My
question is, am I losing significant performance choosing this pre-built
binary vs. compiling myself? For example, my processor might have some more
features than the base version used to build wheels.
I guess you are thinking about using this built wheel on some other
machine? You'd have to be lucky for that to work; the wheel depends
on the symbols it found at build time, which may not exist in the same
places on your other machine.

If it does work, the speed will primarily depend on your BLAS library.

The pypi wheels should be pretty fast; they are built with OpenBLAS,
which is at or near the top of the range for speed across a range of
platforms.

Cheers,

Matthew
Matthew Brett
2017-01-18 00:14:14 UTC
Permalink
Post by Neal Becker
[earlier discussion snipped]
I installed using pip3 install, and it installed a wheel package. I did not
build it - aren't wheels already compiled packages? So isn't it built for
the lowest-common-denominator architecture, not necessarily as fast as one I
built myself on my own machine? My question is, on x86_64, is this potential
difference large enough to bother with not using precompiled wheel packages?
Ah - my guess is that you'd be hard pressed to make a numpy that is as
fast as the precompiled wheel. The OpenBLAS library included in
numpy selects the routines for your CPU at run-time, so they will
generally be fast on your CPU. You might be able to get equivalent
or even better performance with an ATLAS BLAS library compiled on
your exact machine, but that's quite a serious investment of time to
get working, and you'd have to benchmark to find out if you were really
doing any better.
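As an aside, one quick way to see which BLAS a given numpy install was linked against (this uses the long-standing np.__config__ helper, nothing specific to 1.12):

```python
import numpy as np

# Prints the BLAS/LAPACK build configuration for this numpy install;
# a pip/manylinux wheel will typically report openblas here.
np.__config__.show()
```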

Cheers,

Matthew
Neal Becker
2017-01-18 12:02:01 UTC
Permalink
Post by Matthew Brett
[earlier discussion snipped]
OK, so at least for BLAS things should be pretty well optimized.
Nathaniel Smith
2017-01-18 00:20:12 UTC
Permalink
Post by Neal Becker
[earlier discussion snipped]
I installed using pip3 install, and it installed a wheel package. I did not
build it - aren't wheels already compiled packages? So isn't it built for
the lowest-common-denominator architecture, not necessarily as fast as one I
built myself on my own machine? My question is, on x86_64, is this potential
difference large enough to bother with not using precompiled wheel packages?
Ultimately, it's going to depend on all sorts of things, including
most importantly your actual code. Like most speed questions, the only
real way to know is to try it and measure the difference.

The wheels do ship with a fast BLAS (OpenBLAS configured to
automatically adapt to your CPU at runtime), so the performance will
at least be reasonable. Possible improvements would include using a
different and somehow better BLAS (MKL might be faster in some cases),
tweaking your compiler options to take advantage of whatever SIMD ISAs
your particular CPU supports (numpy's build system doesn't do this
automatically but in principle you could do it by hand -- were you
bothering before? does it even make a difference in practice? I
dunno), and using a newer compiler (the linux wheels use a somewhat
ancient version of gcc for Reasons; newer compilers are better at
optimizing -- how much does it matter? again I dunno).

Basically: if you want to experiment and report back then I think we'd
all be interested to hear; OTOH if you aren't feeling particularly
curious/ambitious then I wouldn't worry about it :-).
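A minimal way to run such a measurement yourself, e.g. timing a BLAS-backed matrix product with timeit (sizes are arbitrary; the point is to run the same script against each install and compare):

```python
import timeit

import numpy as np

a = np.random.rand(500, 500)
b = np.random.rand(500, 500)

# Best of three runs of ten matrix products; dominated by BLAS speed
best = min(timeit.repeat(lambda: a @ b, number=10, repeat=3))
print(f"10 matmuls (500x500): {best:.4f} s")
```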

-n
--
Nathaniel J. Smith -- https://vorpus.org
Neal Becker
2017-01-18 12:00:18 UTC
Permalink
Post by Nathaniel Smith
[earlier discussion snipped]
Yes, I always add -march=native, which should pick up whatever SIMD is
available. So my question was primarily whether I should bother. Thanks for
the detailed answer.
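For reference, a from-source build with native flags might look like this (a sketch; exactly how CFLAGS is honored depends on the distutils/toolchain setup, so treat it as a starting point rather than a recipe):

```shell
# Force pip to build numpy from source with CPU-native optimizations,
# skipping the prebuilt wheel entirely
CFLAGS="-O2 -march=native" pip3 install --no-binary :all: numpy
```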
Jerome Kieffer
2017-01-18 07:15:06 UTC
Permalink
On Tue, 17 Jan 2017 08:56:42 -0500
Post by Neal Becker
I've installed via pip3 on linux x86_64, which gives me a wheel. My
question is, am I losing significant performance choosing this pre-built
binary vs. compiling myself? For example, my processor might have some more
features than the base version used to build wheels.
Hi,

I have done some benchmarking (%timeit) of my code running in a
jupyter-notebook within a venv installed with pip + manylinux wheels,
versus ipython and Debian packages (on the same computer).
I noticed the Debian installation was ~20% faster.

I did not investigate further whether those 20% came from the manylinux
build (which I suspect) or from the notebook infrastructure.

HTH,
--
Jérôme Kieffer
Nathan Goldbaum
2017-01-18 07:27:28 UTC
Permalink
I've seen reports on the anaconda mailing list of people seeing similar
speed-ups when they compile e.g. NumPy with a recent gcc. Anaconda has the
same issue as manylinux in that they need to use versions of GCC available
on CentOS 5.

Given the upcoming official EOL for CentOS 5, it might make sense to think
about making a PEP for a CentOS 6-based manylinux2 docker image, which would
allow compiling with a newer GCC.
Post by Jerome Kieffer
[earlier discussion snipped]
_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Julian Taylor
2017-01-18 11:43:25 UTC
Permalink
The version of gcc used will make a large difference in some places.
E.g. the AVX2 integer ufuncs require something around gcc 4.5 to work, and in
general the optimization level of gcc has improved greatly since the
clang competition showed up around that time. CentOS 5 has gcc 4.1, which is
really ancient.
I thought the wheels used newer gccs also on CentOS 5?
Post by Nathan Goldbaum
[earlier discussion snipped]
David Cournapeau
2017-01-18 12:15:16 UTC
Permalink
On Wed, Jan 18, 2017 at 11:43 AM, Julian Taylor <
Post by Julian Taylor
The version of gcc used will make a large difference in some places.
E.g. the AVX2 integer ufuncs require something around 4.5 to work and in
general the optimization level of gcc has improved greatly since the
clang competition showed up around that time. centos 5 has 4.1 which is
really ancient.
I thought the wheels used newer gccs also on centos 5?
I don't know if it is mandatory for many wheels, but it is possible to
build with gcc 4.8 at least and still keep binary compatibility with CentOS
5.x and above, though I am not sure about the impact on speed.

Building numpy/scipy with gcc 4.1 has caused trouble, with errors and even
crashes, for quite some time already, so you definitely want to use a more
recent compiler in any case.

David
Post by Julian Taylor
[earlier discussion snipped]
Nathaniel Smith
2017-01-18 12:59:18 UTC
Permalink
On Wed, Jan 18, 2017 at 3:43 AM, Julian Taylor
Post by Julian Taylor
The version of gcc used will make a large difference in some places.
E.g. the AVX2 integer ufuncs require something around 4.5 to work and in
general the optimization level of gcc has improved greatly since the
clang competition showed up around that time. centos 5 has 4.1 which is
really ancient.
I thought the wheels used newer gccs also on centos 5?
The wheels are built with gcc 4.8, which is the last version that you
can get to build for centos 5.

When we bump to centos 6 as the minimum supported, we'll be able to
switch to gcc 5.3.1.

-n
--
Nathaniel J. Smith -- https://vorpus.org