Discussion:
[Numpy-discussion] Behavior of np.random.uniform
Jaime Fernández del Río
2016-01-19 10:10:51 UTC
Hi all,

There is a PR (#7026 <https://github.com/numpy/numpy/pull/7026>) that
documents the current behavior of np.random.uniform when the low and high
parameters it takes do not conform to the expected low < high. Basically:

- if low < high, random numbers are drawn from [low, high),
- if low = high, all random numbers will be equal to low, and
- if low > high, random numbers are drawn from (high, low] (notice the
change in the open side of the interval.)
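
The three cases above can be checked empirically. This is only a minimal sketch, assuming the legacy `np.random.RandomState` interface; the seed, bounds and sample size are arbitrary choices, and newer NumPy interfaces may reject low > high. The bounds checks are closed on both ends to allow for floating-point rounding.

```python
import numpy as np

# Legacy interface; the seed is arbitrary and only makes the run repeatable.
rng = np.random.RandomState(12345)
n = 100000

# low < high: samples are nominally drawn from [low, high).
s = rng.uniform(2.0, 5.0, size=n)
in_forward = bool((s >= 2.0).all() and (s <= 5.0).all())

# low == high: every sample equals low.
t = rng.uniform(3.0, 3.0, size=n)
all_equal = bool((t == 3.0).all())

# low > high: samples are low + (high - low) * u with u in [0, 1),
# so they start at low and run down towards high.
r = rng.uniform(5.0, 2.0, size=n)
in_reversed = bool((r >= 2.0).all() and (r <= 5.0).all())

print(in_forward, all_equal, in_reversed)
```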

My only worry is that, once we document this, we can no longer claim that
it is a bug. So I would like to hear from others what they think. The
other more or less obvious options would be to:

- Raise an error, but this would require a deprecation cycle, as people
may be relying on the current undocumented behavior.
- Check the inputs and draw numbers from [min(low, high), max(low, high)),
which is minimally different from current behavior.
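
The two alternatives could look roughly like the sketch below. The names `strict_uniform` and `sorted_uniform` are hypothetical wrappers written for illustration only; they are not part of NumPy and are not what the PR implements.

```python
import numpy as np


def strict_uniform(low, high, size=None, rng=np.random):
    """Hypothetical wrapper: raise instead of silently accepting low > high."""
    if np.any(np.asarray(low) > np.asarray(high)):
        raise ValueError("low must not exceed high")
    return rng.uniform(low, high, size)


def sorted_uniform(low, high, size=None, rng=np.random):
    """Hypothetical wrapper: draw from [min(low, high), max(low, high))."""
    lo = np.minimum(low, high)
    hi = np.maximum(low, high)
    return rng.uniform(lo, hi, size)
```

For example, `sorted_uniform(5.0, 2.0, size=3)` draws between 2.0 and 5.0 with 2.0 as the closed end, while `strict_uniform(5.0, 2.0)` raises `ValueError`.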

I will be merging the current documentation changes in the next few days,
so it would be good if any concerns were voiced before that.

Thanks,

Jaime
--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him in his plans
for world domination.
Benjamin Root
2016-01-19 13:36:46 UTC
Are there other functions where this behavior may or may not be happening?
If it isn't consistent across all np.random functions, it probably should
be, one way or the other.

Ben Root

G Young
2016-01-19 14:21:44 UTC
Of the methods defined in *numpy/mtrand.pyx* (excluding helper functions
and *random_integers*, as they are all related to *randint*), *randint* is
the only other function with *low* and *high* parameters. However, it
enforces *high* > *low*.
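
A quick way to see the check in action, as a sketch assuming the legacy `np.random.RandomState` interface (the helper `randint_rejects` is hypothetical and exists only for this illustration):

```python
import numpy as np

rng = np.random.RandomState(0)  # legacy interface; arbitrary seed


def randint_rejects(low, high):
    """Return True if randint raises ValueError for this (low, high) pair."""
    try:
        rng.randint(low, high)
    except ValueError:
        return True
    return False


reversed_bounds = randint_rejects(5, 2)  # high < low
empty_range = randint_rejects(3, 3)      # high == low
valid_range = randint_rejects(2, 5)      # high > low
print(reversed_bounds, empty_range, valid_range)
```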

Greg
Marten van Kerkwijk
2016-01-19 14:39:09 UTC
For what it is worth, the current behaviour seems the most logical to me,
i.e., that the first limit is always the one that is included in the
interval, and the second is not. -- Marten
Chris Barker - NOAA Federal
2016-01-19 16:23:55 UTC
What does the standard lib do for randrange? I see that randint is closed
on both ends, so order doesn't matter, though if it raises for b<a, then
that's a precedent we could follow.

(Sorry, on a phone, can't check)

CHB



G Young
2016-01-19 16:28:59 UTC
In randrange, it raises an exception if low >= high.

I should also add that AFAIK enforcing low < high with floats is a lot
trickier than it is for integers. I have been knee-deep in corner cases
for some time with *randint* where numbers that are visually different are
cast as the same number by *numpy* due to rounding and representation
issues. That situation only gets worse with floats.
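
A small illustration of this kind of collision, with values chosen purely for the example (they are not taken from the randint work mentioned above):

```python
import numpy as np

# Two integers that differ on paper but map to the same float64.
a = 2**53
b = 2**53 + 1
same_as_float = float(a) == float(b)

# Near 1.0 the same thing happens: 1e-17 is below half the spacing there.
x = np.float64(1.0)
y = np.float64(1.0 + 1e-17)
same_near_one = bool(x == y)

print(a != b, same_as_float, same_near_one, np.spacing(x))
```

Any comparison such as low < high is made after this rounding, so two bounds that look different in source code can compare equal.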

Greg

Sebastian Berg
2016-01-19 17:35:28 UTC
Well, actually the stdlib random.uniform docstring says:

Get a random number in the range [a, b) or [a, b] depending on
rounding.

and it is true to its word: it does not care about the relative values of a
and b. So my guess is that it is identical to your version (though one could
check the corner cases a bit more carefully).
A quick check suggests it is the same (though I guess if there was [...]

np.random.set_state(('MT19937', random.getstate()[1][:-1], random.getstate()[1][-1]))

will enable you to draw the same numbers with random.uniform and
np.random.uniform.
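
Spelled out, the comparison could be sketched as follows. This assumes the legacy `np.random.RandomState` interface and an arbitrary seed; the draws are expected to agree, possibly only up to floating-point rounding on some platforms.

```python
import random

import numpy as np

random.seed(2016)  # arbitrary seed, for repeatability only

# Python's state is (version, 624 words followed by a position, gauss_next).
words_and_pos = random.getstate()[1]
legacy = np.random.RandomState()
legacy.set_state(("MT19937", words_and_pos[:-1], words_and_pos[-1]))

# Try ordered, reversed and degenerate bounds with both generators.
pairs = [(2.0, 5.0), (5.0, 2.0), (3.0, 3.0)]
py_draws = [random.uniform(a, b) for a, b in pairs]
np_draws = [legacy.uniform(a, b) for a, b in pairs]

for (a, b), p, q in zip(pairs, py_draws, np_draws):
    print(a, b, p, q)
```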

- Sebastian
Robert Kern
2016-01-21 09:38:45 UTC
Which docstring are you looking at? The current one says [low, high)

http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.uniform.html#numpy.random.uniform

--
Robert Kern
Sebastian Berg
2016-01-21 10:07:52 UTC
Sorry, I was referring to the Python random.uniform function. And as
far as I now understand, the current numpy equivalent (intentionally or
not) seems to do the same thing, which suits me fine.

- Sebastian
Charles R Harris
2016-01-19 17:27:03 UTC
Post by Chris Barker - NOAA Federal
What does the standard lib do for randrange? I see that randint is closed
on both ends, so order doesn't matter, though if it raises for b<a, then
that's a precedent we could follow.
randint is not closed on the high end. The now deprecated random_integers
is the function that does that.

For floats, it's good to have various interval options. For instance, in
generating numbers that will be inverted or have their log taken it is good
to avoid zero. However, the names 'low' and 'high' are misleading...

Chuck
Charles R Harris
2016-01-19 17:35:38 UTC
Note that we also have both arange and linspace for floats. What works well
for integers is not always the best for floats.

Chuck
Robert Kern
2016-01-19 17:36:08 UTC
Post by Charles R Harris
For floats, it's good to have various interval options. For instance, in
generating numbers that will be inverted or have their log taken it is good
to avoid zero. However, the names 'low' and 'high' are misleading...

They are correctly leading the users to the manner in which the author
intended the function to be used. The *implementation* is misleading by
allowing users to do things contrary to the documented intent. ;-)

With floating point and general intervals, there is not really a good way
to guarantee that the generated results avoid the "open" end of the
specified interval or even stay *within* that interval. This function is
definitely not intended to be used as `uniform(closed_end, open_end)`.

--
Robert Kern
Charles R Harris
2016-01-19 17:40:33 UTC
Well, it is possible to make that happen if one is careful or directly sets
the bits in IEEE types...

Chuck
Robert Kern
2016-01-19 17:42:24 UTC

For the unit interval, certainly. For general bounds, I am not so sure.

--
Robert Kern
Charles R Harris
2016-01-19 17:43:55 UTC
Point taken.

Chuck
j***@gmail.com
2016-01-19 17:49:28 UTC
What's the practical importance of this? The boundary points have
probability zero, theoretically.

What happens if low and high are only a few ULPs apart?

If you clip the distribution to obey boundary rules you create mass points
:)

Josef
Robert Kern
2016-01-19 17:41:00 UTC

There are special cases that *can* be implemented and are worth doing so as
they are building blocks for other distributions that do need to avoid 0 or
1 as you say. Full-featured RNG suites do offer these:

[0, 1]
[0, 1)
(0, 1]
(0, 1)
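
As a rough sketch of how two of these variants can be derived from a [0, 1) sample (the constructions below are illustrative assumptions, not functions offered by NumPy; the legacy `np.random.RandomState` interface and the seed are also arbitrary choices):

```python
import numpy as np

rng = np.random.RandomState(0)  # legacy interface; arbitrary seed
n = 100000

# random_sample draws from the half-open unit interval [0, 1).
u = rng.random_sample(n)

# (0, 1]: reflect the sample. 1.0 - u is exact here and never reaches 0.
left_open = 1.0 - u

# (0, 1): reject exact zeros; the remaining samples stay uniform.
both_open = u[u > 0.0]

# Both are now safe inputs for a logarithm or a reciprocal.
logs = np.log(left_open)
inverses = 1.0 / both_open
print(bool(np.isfinite(logs).all()), bool(np.isfinite(inverses).all()))
```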

--
Robert Kern
Chris Barker
2016-01-19 17:51:54 UTC
Post by Charles R Harris
randint is not closed on the high end. The now deprecated random_integers
is the function that does that.
I was referring to the stdlib randint:

In [7]: [random.randint(2,5) for i in range(10)]
Out[7]: [5, 5, 2, 4, 5, 5, 3, 5, 2, 2]

thinking that compatibility with that would be good -- but that ship has
sailed.

randrange is open on the high end, as range is, (duh!)

In [9]: [random.randrange(2,5) for i in range(10)]
Out[9]: [3, 3, 4, 3, 3, 2, 4, 2, 3, 3]

but you get an exception if low >= high:

In [10]: random.randrange(5,2)
ValueError: empty range for randrange() (5,2, -3)

I like the exception idea best -- but backward compatibility and all that.

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov
Alan G Isaac
2016-01-19 17:23:27 UTC
Post by G Young
Of the methods defined in *numpy/mtrand.pyx* (excluding helper functions
and *random_integers*, as they are all related to *randint*), *randint* is
the only other function with *low* and *high* parameters. However, it
enforces *high* > *low*.
In principle, if we are describing an *interval*, that is
the right thing to do:
https://en.wikipedia.org/wiki/Interval_(mathematics)#Including_or_excluding_endpoints

Alan Isaac
Peter Creasey
2016-01-20 18:57:35 UTC
+1 for the deprecation warning for low>high, I think the cases where
that is called are more likely to be unintentional rather than someone
trying to use uniform(closed_end, open_end) and you might help users
find bugs - i.e. the idioms of ‘explicit is better than implicit’ and
‘fail early and fail loudly’ apply.

I would also point out that requiring open vs closed intervals (in
doubles) is already an extremely specialised use case. In terms of
*sampling the reals*, there is no difference between the intervals
(a,b) and [a,b], because the endpoints have measure 0, and even with
double-precision arithmetic, you are going to have to make several
petabytes of random data before you hit an endpoint...

Peter
Charles R Harris
2016-01-20 19:07:55 UTC
Petabytes ain't what they used to be ;) I remember testing some hardware
which, due to grounding/timing issues would occasionally goof up a readable
register. The hardware designers never saw it because they didn't test for
hours and days at high data rates. But it was there, and it would show up
in the data. Measure zero is about as real as real numbers...

Chuck
Peter Creasey
2016-01-21 01:55:25 UTC
Actually, your point is well taken and I am quite mistaken. If you
pick some values like uniform(low, low * (1+2**-52)) then you can hit
your endpoints pretty easily. I am out of practice making
pathological tests for double precision arithmetic.
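
A sketch of that pathological case, assuming the legacy `np.random.RandomState` interface with an arbitrary seed and taking low = 1.0 for concreteness:

```python
import numpy as np

rng = np.random.RandomState(0)  # legacy interface; arbitrary seed

low = 1.0
high = low * (1 + 2**-52)  # the next float64 after 1.0

s = rng.uniform(low, high, size=1000)

# Only two float64 values exist in [low, high], so every sample is an endpoint.
hits_low = int((s == low).sum())
hits_high = int((s == high).sum())
print(hits_low, hits_high)
```

With bounds this close together, the nominally open end is returned for a large share of the draws rather than once in petabytes of data.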

I guess my suggestion would be to add the deprecation warning and
change the docstring to warn that the interval is not guaranteed to be
right-open.

Peter
Jaime Fernández del Río
2016-01-21 07:06:10 UTC
There doesn't seem to be much of a consensus on the way to go, so leaving
things as they are and have been seems the wisest choice for now. Thanks
for all the feedback. I will work with Greg on documenting the status quo
properly.

We probably want to follow the lead of the stdlib's random.uniform in
describing how open the interval actually ends up being in practice:

https://docs.python.org/3.6/library/random.html#random.uniform


Robert Kern
2016-01-21 09:35:16 UTC

Ugh. Be careful in documenting the way things currently work. No one
intended for it to work that way! No one should rely on high<low being
allowed in the future. I'm tempted to just make a PR adding the deprecation
lack-of-consensus-be-damned. It is currently documented correctly for the
way it is supposed to be used, and that is a good thing. Please relegate
discussions of the counter-documented behavior to the Notes.

--
Robert Kern