Stephan Hoyer
2015-10-12 03:38:15 UTC
Currently, NaT (Not a Time) does not get any special treatment when used
in comparisons with datetime64/timedelta64 objects.
This means that it's equal to itself, and treated as the smallest possible
value in comparisons, e.g., NaT == NaT and NaT < any_other_time.
To me, this seems a little crazy for a value meant to denote a
missing/invalid time -- NaT should really have the same comparison behavior
as NaN. That is, all comparisons with NaT should be false (except !=, which
should be true, as it is for NaN). The good news is
that updating this behavior turns out to be only a matter of adding a
single conditional to umath/loops.c.src -- most of the work would be fixing
tests.
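
To make the difference concrete, here is a minimal pure-Python sketch of the
two behaviors. It is only an illustration, not the actual change to
umath/loops.c.src: it models a datetime64 value as a plain integer and relies
on NaT being stored as the smallest int64 value. The names NAT, raw_less,
raw_equal, nan_like_less and nan_like_equal are made up for this example.

```python
# Illustration only: datetime64/timedelta64 values modeled as plain integers.
# NaT is stored as the smallest int64 value, which is why a raw integer
# comparison treats it as smaller than every other time.
NAT = -2**63


def raw_less(a, b):
    """Current behavior: plain integer comparison, NaT is not special."""
    return a < b


def raw_equal(a, b):
    """Current behavior: NaT compares equal to itself."""
    return a == b


def nan_like_less(a, b):
    """Proposed behavior: any ordering comparison involving NaT is False."""
    if a == NAT or b == NAT:
        return False
    return a < b


def nan_like_equal(a, b):
    """Proposed behavior: NaT is not equal to anything, including itself."""
    if a == NAT or b == NAT:
        return False
    return a == b


# For reference, this is how NaN already behaves for floats.
nan = float("nan")
nan_results = (nan == nan, nan < 1.0, nan > 1.0, nan != nan)
```

With these definitions, raw_less(NAT, 0) and raw_equal(NAT, NAT) are both
True, while nan_like_less(NAT, 0) and nan_like_equal(NAT, NAT) are both False,
matching the float results in nan_results (False, False, False, True).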
Whether you call this an API change or a bug fix is somewhat of a judgment
call, but I believe this change is certainly consistent with the goals of
datetime64. It's also consistent with how NaT is used in pandas, which uses
its own wrappers around datetime64 precisely to fix these sorts of issues.
So I'm raising this here to get some opinions on the right path forward:
1. Is this a bug fix that we can backport to 1.10.x?
2. Is this an API change that should wait until 1.11?
3. Is this something where we need to start issuing warnings and deprecate
the existing behavior?
My vote would be for option 2. I think it's really a bug fix, but it would
break enough code that I wouldn't want to spring this on anybody in a bug
fix release. I'd rather not wait several releases on this one because that
will only exacerbate issues with being able to use datetime64 reliably.
Stephan