python: Incorrect(?) time difference between `datetime` objects spanning a daylight saving time change


(edited to add)

PROLOG: Today I became aware of the concept of "wall time". I will always and forever consider it harmful.


I have two datetimes, one representing a certain time-of-day just before a time change, and the other representing the same time-of-day one calendar day after the first datetime. I would expect the time difference between these two objects not to be exactly one day, but that's what I see (with python3.8, which is all I have to work with).

Taking the difference of the timestamps associated with the datetimes returns exactly what I would expect to see. Taking the difference of datetime objects when they span a time change looks flat-out wrong to me.

Is this expected behavior?

from datetime import datetime, timedelta
from dateutil.tz import gettz # pip install python-dateutil

def iso(dt):
    # Spelled-out equivalent of '%FT%T%z'; %F and %T are not supported on every platform.
    return dt.strftime('%Y-%m-%dT%H:%M:%S%z')

# Daylight saving time begins at 2 a.m. local time on Sunday, March 10, 2024
us_central = gettz('US/Central')
before = datetime(2024, 3,  9, 15, 22, 1, tzinfo=us_central)
after  = datetime(2024, 3, 10, 15, 22, 1, tzinfo=us_central)

print()
print(f'before time change: {iso(before)}')
print(f' after time change: {iso(after)}')

naive = after - before
by_timestamps = timedelta(seconds = after.timestamp() - before.timestamp())
difference_difference = naive.total_seconds() - by_timestamps.total_seconds()

print()
print('Differences:')
print(f'        naive: {repr(naive)}')
print(f'by timestamps: {repr(by_timestamps)}')
print(f'        error: {difference_difference}s')

Output (python3.8)

before time change: 2024-03-09T15:22:01-0600
 after time change: 2024-03-10T15:22:01-0500

Differences:
        naive: datetime.timedelta(days=1)
by timestamps: datetime.timedelta(seconds=82800)
        error: 3600.0s

(Edited to add)

This is so counterintuitive to me. "Wall time" is very strange.

from datetime import datetime, timedelta
from dateutil.tz import gettz # pip install python-dateutil

# Daylight saving time begins at 2 a.m. local time on Sunday, March 10, 2024
us_central = gettz('US/Central')
before = datetime(2024, 3,  9, 15, 22, 1, tzinfo=us_central)
after  = datetime(2024, 3, 10, 15, 22, 1, tzinfo=us_central)

print(repr((before + timedelta(days=1)) - after))
print(repr((before + timedelta(seconds=86400)) - after))

# what I think the above should do...
print(repr(datetime.fromtimestamp(before.timestamp() + 86400, tz=before.tzinfo) - after))

Output (python3.8)

datetime.timedelta(0)
datetime.timedelta(0)
datetime.timedelta(seconds=3600)

There are 2 best solutions below

4
BoppreH On

From cpython's source code (the pure Python fallback module, though I expect the results to be equivalent to the C code):

class datetime:
    ...
    def __sub__(self, other):
        "Subtract two datetimes, or a datetime and a timedelta."
        if not isinstance(other, datetime):
            if isinstance(other, timedelta):
                return self + -other
            return NotImplemented

        days1 = self.toordinal()
        days2 = other.toordinal()
        secs1 = self._second + self._minute * 60 + self._hour * 3600
        secs2 = other._second + other._minute * 60 + other._hour * 3600
        base = timedelta(days1 - days2,
                         secs1 - secs2,
                         self._microsecond - other._microsecond)
        if self._tzinfo is other._tzinfo:
            return base
        myoff = self.utcoffset()
        otoff = other.utcoffset()
        if myoff == otoff:
            return base
        if myoff is None or otoff is None:
            raise TypeError("cannot mix naive and timezone-aware time")
        return base + otoff - myoff

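Note that there's no code to account for daylight saving time changes when both operands share the same tzinfo object: the "self._tzinfo is other._tzinfo" check returns the plain wall-clock difference without ever calling utcoffset(). The UTC offsets are only consulted when the two tzinfo objects differ.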

Therefore I'd suggest converting your datetime objects to UTC (for example with .astimezone(timezone.utc)) before performing any timedelta computation.
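A minimal sketch of that suggestion, using only the standard library so it also runs on Python 3.8. ToyCentral2024 is a hypothetical stand-in for US/Central written just for this illustration (it only knows the 2024 rules and ignores the fold attribute); real code would use dateutil or zoneinfo instead.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyCentral2024(tzinfo):
    """Hypothetical stand-in for US/Central, valid for 2024 only.

    It ignores the `fold` attribute, so it cannot disambiguate the
    repeated hour in November.
    """

    DST_START = datetime(2024, 3, 10, 2)  # naive wall-clock boundaries
    DST_END = datetime(2024, 11, 3, 2)

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = self.DST_START <= wall < self.DST_END
        return timedelta(hours=1) if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-6) + self.dst(dt)

    def tzname(self, dt):
        return "CDT" if self.dst(dt) else "CST"


central = ToyCentral2024()
before = datetime(2024, 3, 9, 15, 22, 1, tzinfo=central)
after = datetime(2024, 3, 10, 15, 22, 1, tzinfo=central)

# Same tzinfo object on both sides: the offsets are never consulted.
wall_difference = after - before

# Convert to UTC first: the subtraction now measures elapsed time.
elapsed = after.astimezone(timezone.utc) - before.astimezone(timezone.utc)

# A different tzinfo instance with equal rules fails the `is` check,
# so __sub__ falls through to the utcoffset() branch.
after_other_instance = after.replace(tzinfo=ToyCentral2024())
via_offsets = after_other_instance - before

print(wall_difference)  # 1 day, 0:00:00
print(elapsed)          # 23:00:00
print(via_offsets)      # 23:00:00
```

The third result shows that the behavior hinges on object identity rather than on the zone's rules, which matches the "is" comparison in the source above.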

And if your code needs to be accurate to the second, note that Python also ignores leap seconds, and even Unix time (as returned by .timestamp()) does not count them. Subtracting two Unix timestamps that straddle a leap second will be off by one second: even though the timestamps look independent of the calendar, Unix time is defined with a fixed 86,400 seconds per day, for backwards compatibility with code that assumed exactly that.
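A small illustration of the leap second point, assuming the leap second that was inserted at the end of 2016-12-31 UTC. It is only a sketch of how the standard library behaves, not a way to recover the missing second.

```python
from datetime import datetime, timezone

# A leap second (23:59:60 UTC) was inserted at the end of 2016-12-31.
day_start = datetime(2016, 12, 31, tzinfo=timezone.utc)
next_day = datetime(2017, 1, 1, tzinfo=timezone.utc)

unix_seconds = next_day.timestamp() - day_start.timestamp()
print(unix_seconds)  # 86400.0, although 86401 SI seconds elapsed

# datetime cannot represent the leap second itself.
try:
    datetime(2016, 12, 31, 23, 59, 60, tzinfo=timezone.utc)
except ValueError as exc:
    print(f"rejected: {exc}")
```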

1
Mark Tolonen On

Do time difference calculations in UTC; otherwise "wall" time is used. The code below fixes yours and adds an example of advancing in 15-minute increments across the end of daylight saving time (Nov 3, 2024, when 2am rolls back to 1am). It also uses the newer built-in zoneinfo module (available since Python 3.9), which uses the system time zone database or falls back to the third-party tzdata package for the latest time zone information:

# May need "pip install -U tzdata" on some OSes for latest time zone info.
import datetime as dt
import zoneinfo as zi

# Daylight saving time begins at 2 a.m. local time on Sunday, March 10, 2024
us_central = zi.ZoneInfo('US/Central')

# Compute times as a local time zone but convert to UTC.
before = dt.datetime(2024, 3,  9, 15, 22, 1, tzinfo=us_central).astimezone(dt.timezone.utc)
after  = dt.datetime(2024, 3, 10, 15, 22, 1, tzinfo=us_central).astimezone(dt.timezone.utc)

# Display in preferred time zone
print()
print(f'before time change: {before.astimezone(us_central)}')
print(f' after time change: {after.astimezone(us_central)}')

naive = after - before
by_timestamps = dt.timedelta(seconds = after.timestamp() - before.timestamp())
difference_difference = naive.total_seconds() - by_timestamps.total_seconds()

print()
print('Differences:')
print(f'        naive: {repr(naive)}')
print(f'by timestamps: {repr(by_timestamps)}')
print(f'        error: {difference_difference}s')

current = dt.datetime(2024, 11, 3, 1, tzinfo=us_central).astimezone(dt.timezone.utc)
for _ in range(12):
    print(current.astimezone(us_central))
    current += dt.timedelta(minutes=15)

Output (with notations):


before time change: 2024-03-09 15:22:01-06:00
 after time change: 2024-03-10 15:22:01-05:00

Differences:
        naive: datetime.timedelta(seconds=82800)
by timestamps: datetime.timedelta(seconds=82800)
        error: 0.0s
2024-11-03 01:00:00-05:00
2024-11-03 01:15:00-05:00
2024-11-03 01:30:00-05:00
2024-11-03 01:45:00-05:00
2024-11-03 01:00:00-06:00
2024-11-03 01:15:00-06:00
2024-11-03 01:30:00-06:00
2024-11-03 01:45:00-06:00
2024-11-03 02:00:00-06:00
2024-11-03 02:15:00-06:00
2024-11-03 02:30:00-06:00
2024-11-03 02:45:00-06:00

Also note that datetime.datetime.utcnow() and datetime.datetime.utcfromtimestamp() are not recommended because they return naive datetimes: the values hold UTC, but methods such as .timestamp() and .astimezone() treat a naive datetime as local system time (and therefore "wall" time). Both have been deprecated since Python 3.12.
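A short sketch of the aware alternatives, which work on Python 3.8 as well. The timestamp value 1_700_000_000 is an arbitrary example chosen for illustration.

```python
from datetime import datetime, timezone

# Aware replacement for datetime.utcnow()
now_utc = datetime.now(timezone.utc)

# Aware replacement for datetime.utcfromtimestamp(ts)
ts = 1_700_000_000  # arbitrary example value
stamp = datetime.fromtimestamp(ts, tz=timezone.utc)

print(stamp.isoformat())  # 2023-11-14T22:13:20+00:00
print(stamp.timestamp())  # 1700000000.0

# Dropping tzinfo reproduces what utcfromtimestamp(ts) would return.
# .timestamp() then interprets the naive value as *local* time, so it
# only round-trips to ts on a machine whose local zone is UTC.
naive_stamp = stamp.replace(tzinfo=None)
print(naive_stamp.timestamp())
```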