Elliot Hallmark
2015-06-19 21:19:56 UTC
Debian Sid, 64-bit. I was trying to fix the problem of np.dot running very
slowly.
I ended up uninstalling numpy, installing libatlas3-base through apt-get,
and re-installing numpy. The performance of dot is greatly improved! But
apart from timing it, I can't tell whether numpy is set up correctly.
Compare the fast install to one in a virtualenv that is still
slow:
###
fast one
###
In [1]: import time, numpy
In [2]: n=1000
In [3]: A = numpy.random.rand(n,n)
In [4]: B = numpy.random.rand(n,n)
In [5]: then = time.time(); C=numpy.dot(A,B); print time.time()-then
0.306427001953
In [6]: numpy.show_config()
blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
###
slow one
###
In [1]: import time, numpy
In [2]: n=1000
In [3]: A = numpy.random.rand(n,n)
In [4]: B = numpy.random.rand(n,n)
In [5]: then = time.time(); C=numpy.dot(A,B); print time.time()-then
7.88430500031
In [6]: numpy.show_config()
blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
#####
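Since show_config() prints the same thing in both environments, one other thing that might be worth checking is which shared libraries are actually mapped into the running process. The following is only a rough sketch, assuming Linux (it reads /proc/self/maps) and Python 3. The helper names and the file-name pattern are made up for illustration and are not part of numpy.

```python
import os
import re

# Substrings that usually appear in BLAS/LAPACK library file names.
# This pattern is an assumption for illustration; extend it as needed.
BLAS_PATTERN = re.compile(r"blas|lapack|atlas|mkl", re.IGNORECASE)


def blas_paths(maps_text):
    """Return sorted, de-duplicated BLAS-like library paths found in maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and BLAS_PATTERN.search(os.path.basename(fields[5])):
            paths.add(fields[5])
    return sorted(paths)


def loaded_blas_libraries():
    """Linux only: list BLAS-like libraries mapped into the current process."""
    with open("/proc/self/maps") as f:
        return blas_paths(f.read())
```

Calling loaded_blas_libraries() in each environment, after importing numpy and running one numpy.dot, should list the library files the process really loaded, which may differ even when show_config() does not.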
Further, in the following comparison between plain CPython and converting to
a numpy array for one operation, CPython is faster by the same
margin in both environments. But another user found numpy to be faster.
In [1]: import numpy as np
In [2]: pts = range(100,1000)
In [3]: pts[100] = 0
In [4]: %timeit pts_arr = np.array(pts); mini = np.argmin(pts_arr)
10000 loops, best of 3: 129 µs per loop
In [5]: %timeit mini = sorted(enumerate(pts))[0][1]
10000 loops, best of 3: 89.2 µs per loop
The other user got
In [29]: %timeit pts_arr = np.array(pts); mini = np.argmin(pts_arr)
10000 loops, best of 3: 37.7 µs per loop
In [30]: %timeit mini = sorted(enumerate(pts))[0][1]
10000 loops, best of 3: 69.2 µs per loop
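A note on the pure-Python side of this comparison: sorted(enumerate(pts)) sorts (index, value) pairs by index, so [0][1] is just pts[0]. For this particular list that happens to equal the index of the minimum (both are 100), but the two timings are not measuring the same operation. A minimal sketch of two pure-Python equivalents of argmin, written for Python 3, with function names chosen only for illustration:

```python
def argmin_sorted(values):
    """Index of the smallest value, found by sorting (value, index) pairs."""
    return sorted((v, i) for i, v in enumerate(values))[0][1]


def argmin_min(values):
    """Index of the smallest value in a single pass, without sorting."""
    return min(range(len(values)), key=values.__getitem__)


pts = list(range(100, 1000))  # list() so item assignment also works on Python 3
pts[100] = 0

# (index, value) pairs sort by index first, so this is simply pts[0].
print(sorted(enumerate(pts))[0][1])          # 100
print(argmin_sorted(pts), argmin_min(pts))   # 100 100

other = [5, 3, 9]
print(sorted(enumerate(other))[0][1])            # 5, which is other[0]
print(argmin_sorted(other), argmin_min(other))   # 1 1
```

Timing argmin_min against the numpy version would compare like with like.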
And I can't help but wonder if there is further configuration I need
to make numpy faster, or if this is just a difference between our
machines.
In the future, should I ignore show_config() and just do this dot product
test?
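If the dot product test does become the check, a reusable version might look like the sketch below. It assumes Python 3.3 or later (for time.perf_counter). best_time and time_dot are hypothetical helper names, and taking the best of three runs is just a choice that mirrors what %timeit reports.

```python
import time


def best_time(func, repeats=3):
    """Call func() `repeats` times and return the shortest duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return min(durations)


def time_dot(n=1000, repeats=3):
    """Time numpy.dot on two random n x n matrices, as in the sessions above."""
    import numpy  # imported here so best_time stays usable without numpy

    a = numpy.random.rand(n, n)
    b = numpy.random.rand(n, n)
    return best_time(lambda: numpy.dot(a, b), repeats)
```

Running time_dot() in each environment gives one number per install that can be compared directly.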
Any guidance would be appreciated.
Thanks,
Elliot