Nadav Horesh
2015-10-14 05:23:48 UTC
I have binary files ranging in size from a few MB to 1 GB, which I read and process as memory-mapped files (via np.memmap). Up to numpy 1.9, creating a recarray on an existing file (without reading its contents) was instantaneous; now it takes ~6 seconds (system: Arch Linux on Sandy Bridge). The top of a profile (using IPython's %prun) is:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       21    3.037    0.145    4.266    0.203 _internal.py:372(_check_field_overlap)
  3713431    1.663    0.000    1.663    0.000 _internal.py:366(<genexpr>)
  3713750    0.790    0.000    0.790    0.000 {range}
  3713709    0.406    0.000    0.406    0.000 {method 'update' of 'set' objects}
      322    0.320    0.001    1.984    0.006 {method 'extend' of 'list' objects}
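For reference, a minimal sketch of the kind of code involved; the file name and the structured dtype below are placeholders, since the original report does not show the actual record layout:

import numpy as np

# Hypothetical structured dtype standing in for the real record layout.
record_dtype = np.dtype([('timestamp', '<u8'),
                         ('channel',   '<u2'),
                         ('samples',   '<i2', (512,))])

# Map an existing binary file without reading its contents; creating the
# recarray view on it is the step that regressed to ~6 seconds.
data = np.memmap('capture.bin', dtype=record_dtype, mode='r')
rec = data.view(np.recarray)

print(rec.shape, rec.dtype.names)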
Nadav.