Michael Ward
2016-06-28 20:36:26 UTC
Heya, I'm not a numbers guy, but I maintain servers for scientists and
researchers who are. Someone pointed out that our numpy installation on
a particular server was only using one core. I'm not aware of who installed
the previous versions of numpy/OpenBLAS, or how, so I installed them from
scratch and confirmed that the user's test code now runs on multiple cores
as expected, drastically reducing the runtime.
Now the user is writing back to say, "my test code is fast now, but
numpy.test() is still about three times slower than <some other server
we don't manage>". When I watch htop as numpy.test() executes, sure
enough, it's using one core. Now I'm not sure if that's the expected
behavior or not. Questions:
* If numpy.test() is supposed to use multiple cores, why isn't it, when
we've established with other test code that multithreading now works?
* If numpy.test() is not supposed to use multiple cores, what could
explain it being drastically slower than on another server with a
comparable CPU, when the user's test code performs comparably?
For what it's worth, the user's "test" code which does run on multiple
cores is as simple as:
import numpy as np

size = 4000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))
x = np.dot(a, b)
Whereas this uses only one core:
numpy.test()
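In case it helps compare the two servers, here is a rough sketch of what
I've been running to confirm which BLAS numpy is actually linked against
and how the dot-product timing reacts to the thread count. The
OPENBLAS_NUM_THREADS value and the 4000x4000 size are just illustrative
choices, not anything authoritative:

import os
# Assumed: OpenBLAS honors OPENBLAS_NUM_THREADS; it must be set before
# numpy is imported. Try "1" versus the machine's core count.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import time
import numpy as np

# Prints the BLAS/LAPACK libraries numpy was built against; the OpenBLAS
# paths from site.cfg should show up here if the build picked them up.
np.__config__.show()

size = 4000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))

t0 = time.time()
x = np.dot(a, b)
print("%dx%d dot took %.2f s" % (size, size, time.time() - t0))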
---------------------------
OpenBLAS 0.2.18 was basically just compiled with "make", nothing special
to it. Numpy 1.11.0 was installed from source (python setup.py
install), using a site.cfg file to point numpy to the new OpenBLAS.
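For reference, the site.cfg is essentially the stock OpenBLAS stanza from
numpy's site.cfg.example; the install prefix below is a placeholder for
wherever OpenBLAS actually landed on our box, not the real path:

[openblas]
libraries = openblas
library_dirs = /opt/OpenBLAS/lib
include_dirs = /opt/OpenBLAS/include
runtime_library_dirs = /opt/OpenBLAS/lib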
Thanks,
Mike