[Numpy-discussion] deprecate fromstring() for text reading?

Discussion:

Chris Barker

2015-10-22 17:03:15 UTC

There was just a question about a bug/issue with scipy.fromstring (which is
numpy.fromstring) when used to read integers from a text file.

https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html

fromstring() is buggy and inflexible for reading text files -- and it is
a very, very ugly mess of code. I dug into it a while back, and gave up --
just too much of a mess!
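
For context, a minimal sketch of the text-reading mode in question, next to one existing alternative. It assumes a NumPy version where np.fromstring() still accepts a non-empty `sep`; the sample string and variable names are made up for illustration, and np.loadtxt() is shown only as one option that already exists, not as the proposed replacement.

```python
import io

import numpy as np

text = "1 2 3 4"  # illustrative input, not taken from the original report

# Text mode: a non-empty `sep` makes fromstring() parse numbers from text.
from_text = np.fromstring(text, dtype=int, sep=" ")

# One existing alternative: loadtxt() reading from an in-memory file object.
from_loadtxt = np.loadtxt(io.StringIO(text), dtype=int)

print(from_text)
print(from_loadtxt)
```

Both calls should yield the same four integers for well-formed input like this; the complaints in this thread are about how the text mode behaves on less tidy input.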

So we really should completely re-implement it, or deprecate it. I doubt
anyone is going to do a big refactor, so that means deprecating it.

Also -- if we do want a fast function for reading numbers from text files
(which would be nice, actually), it really should get a new name anyway.

(and the new dtype system that is hopefully coming would make it easier to
write cleanly)

I'm not sure what deprecating something means, though -- have it raise a
deprecation warning in the next version?
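
One way this could look, as a rough sketch only: a wrapper that warns when the text mode is used and stays quiet for binary data. The name `fromstring_compat` and the warning message are hypothetical, and this is not how NumPy implements deprecations internally.

```python
import warnings

import numpy as np


def fromstring_compat(string, dtype=float, sep=""):
    """Hypothetical wrapper: warn on text mode, keep binary mode quiet."""
    if sep != "":
        # Text mode is the part proposed for deprecation.
        warnings.warn(
            "reading text with fromstring() is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return np.fromstring(string, dtype=dtype, sep=sep)
    # Binary mode: interpret raw bytes; copy so the result is writable.
    return np.frombuffer(string, dtype=dtype).copy()


# Record warnings so the effect is visible regardless of the default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    values = fromstring_compat("1 2 3", dtype=int, sep=" ")

print(values, [w.category.__name__ for w in caught])
```

DeprecationWarning is hidden by default in many contexts, so users would only see it with warnings enabled (for example under a test runner).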

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov

Marten van Kerkwijk

2015-10-22 22:35:28 UTC

I think it would be good to keep the usage to read binary data at least. Or
is there a good alternative to `np.fromstring(<bytes>, dtype=...)`? --
Marten
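
For reference, one candidate for the binary case is np.frombuffer(), sketched below. The payload and variable names are illustrative assumptions; whether it fully covers every use of `np.fromstring(<bytes>, dtype=...)` is not settled in this thread.

```python
import numpy as np

raw = np.arange(4, dtype=np.int32).tobytes()  # illustrative binary payload

# frombuffer() reinterprets the bytes without copying; for an immutable
# bytes object the resulting array is read-only.
view = np.frombuffer(raw, dtype=np.int32)

# Copy when the array needs to be modified.
writable = view.copy()
writable[0] = 10

print(view, writable)
```

The main behavioral difference to keep in mind is that frombuffer() returns a view on the input, so a copy is needed wherever the caller expects an independent, writable array.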

_______________________________________________
NumPy-Discussion mailing list
https://mail.scipy.org/mailman/listinfo/numpy-discussion

Chris Barker - NOAA Federal

2015-10-22 23:47:30 UTC

Post by Marten van Kerkwijk
I think it would be good to keep the usage to read binary data at least.

Agreed -- it's only the text file reading I'm proposing to deprecate. It
was kind of weird to cram it in there in the first place.

Oh, fromfile() has the same issues.

Chris

Charles R Harris

2015-10-23 22:13:02 UTC

There was discussion at SciPy 2015 of separating out the text reading
abilities of pandas so that numpy could include it. We should contact Jeff
Reback and see about moving that forward.

Chuck

Jeff Reback

2015-10-23 22:30:39 UTC

Post by Charles R Harris
There was discussion at SciPy 2015 of separating out the text reading abilities of pandas so that numpy could include it. We should contact Jeff Reback and see about moving that forward.
Chuck

IIRC Thomas Caswell was interested in doing this :)

Jeff

Nathaniel Smith

2015-10-23 22:49:06 UTC

Post by Jeff Reback

Post by Charles R Harris
On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal <

Post by Chris Barker - NOAA Federal

Post by Marten van Kerkwijk
I think it would be good to keep the usage to read binary data at least.

Agreed -- it's only the text file reading I'm proposing to deprecate.

It was kind of weird to cram it in there in the first place.