Mathieu Dubois
2015-12-09 14:51:55 UTC
Dear all,
If I am correct, using mmap_mode with .npz files has no effect, i.e.:
f = np.load("data.npz", mmap_mode="r")
X = f['X']
will load all the data into memory.
Can somebody confirm that?
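A quick way to check this is the sketch below. It writes a small throwaway archive (the array contents and the temporary path are made up for illustration) and tests whether the returned array is a numpy.memmap:

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway archive, only for this check.
    path = os.path.join(tmp, "data.npz")
    np.savez(path, X=np.arange(10))

    # Same call pattern as above: mmap_mode is given, but the file is an .npz.
    with np.load(path, mmap_mode="r") as f:
        X = f["X"]

    # If mmap_mode were honoured, X would be a numpy.memmap instance.
    is_memmap = isinstance(X, np.memmap)
    print(type(X).__name__, is_memmap)
```

If the check prints a plain ndarray, the array was read fully into memory.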
If I'm correct, the mmap_mode argument could be passed to the NpzFile
class, which could in turn perform the correct operation. One way to
handle that would be to use the ZipFile.extract method to write the .npy
file to disk and then load it with numpy.load and the mmap_mode
argument. Note that the user would have to remove the extracted file to
reclaim disk space (I guess that's OK).
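Done by hand, the workaround could look like the sketch below. It is not what NpzFile does today. It assumes the archive was written by np.savez, which stores each array as a member named "<key>.npy"; the working directory and array contents are placeholders:

```python
import os
import tempfile
import zipfile

import numpy as np

# Placeholder archive in a scratch directory.
workdir = tempfile.mkdtemp()
npz_path = os.path.join(workdir, "data.npz")
np.savez(npz_path, X=np.arange(12).reshape(3, 4))

# Extract the member to disk; ZipFile.extract returns the path it wrote.
with zipfile.ZipFile(npz_path) as zf:
    npy_path = zf.extract("X.npy", path=workdir)

# A plain .npy file can be memory-mapped by numpy.load.
X = np.load(npy_path, mmap_mode="r")
x_is_memmap = isinstance(X, np.memmap)
x_sum = int(X.sum())
print(type(X).__name__, X.shape)

# The caller has to clean up: drop the mapping, then delete the file.
del X
os.remove(npy_path)
```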
One problem that could arise is that the extracted .npy file can be large
(that is the whole point of memory mapping), so it may be useful to
offer some control over where the file is extracted (for instance, /tmp
may be too small to hold it). numpy.load could offer a new option for
that (passed on to ZipFile.extract).
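To make the idea concrete, here is a sketch of what such an option might look like from the user's side. The helper name npz_member_mmap and its extract_dir parameter are hypothetical, not an existing NumPy API. Wrapping the cleanup in a context manager is one possible design choice rather than part of the proposal:

```python
import contextlib
import os
import tempfile
import zipfile

import numpy as np


@contextlib.contextmanager
def npz_member_mmap(npz_path, key, extract_dir=None, mmap_mode="r"):
    """Hypothetical helper (not part of NumPy): extract one array from an
    .npz into extract_dir, memory-map it, and delete the file on exit."""
    if extract_dir is None:
        extract_dir = tempfile.gettempdir()
    with zipfile.ZipFile(npz_path) as zf:
        # The target directory is simply forwarded to ZipFile.extract.
        npy_path = zf.extract(key + ".npy", path=extract_dir)
    arr = np.load(npy_path, mmap_mode=mmap_mode)
    try:
        yield arr
    finally:
        # Removing a file that is still mapped works on POSIX systems;
        # on Windows the caller may need to drop its reference first.
        del arr
        os.remove(npy_path)


# Usage: "scratch" stands in for a directory with enough free space.
scratch = tempfile.mkdtemp()
npz_path = os.path.join(scratch, "data.npz")
np.savez(npz_path, X=np.ones((4, 5)))

extracted = os.path.join(scratch, "X.npy")
with npz_member_mmap(npz_path, "X", extract_dir=scratch) as X:
    existed_inside = os.path.exists(extracted)
    total = float(X.sum())
existed_after = os.path.exists(extracted)
```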
Does that make sense?
Thanks in advance,
Mathieu