Validate an ISO-8601 datetime string in Python?


I want to write a function that takes a string and returns True if it is a valid ISO-8601 datetime (precise to microseconds, including a timezone offset) and False otherwise.

I have found other questions that provide different ways of parsing datetime strings, but I want to return True in the case of ISO-8601 format only. Parsing doesn't help me unless I can get it to throw an error for formats that don't match ISO-8601.

(I am using the nice arrow library elsewhere in my code. A solution that uses arrow would be welcome.)


EDIT: It appears that a general solution to "is this string a valid ISO 8601 datetime" does not exist among the common Python datetime packages.

So, to make this question narrower, more concrete and answerable, I will settle for a format string that will validate a datetime string in this form:

'2016-12-13T21:20:37.593194+00:00'

Currently I am using:

import datetime
my_timestamp = '2016-12-13T21:20:37.593194+00:00'
format_string = '%Y-%m-%dT%H:%M:%S.%f%z'
datetime.datetime.strptime(my_timestamp, format_string)

This gives:

ValueError: time data '2016-12-13T21:20:37.593194+00:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'

The problem seems to lie with the colon in the UTC offset (+00:00). If I use an offset without a colon (e.g. '2016-12-13T21:20:37.593194+0000'), this parses properly as expected. This is apparently because datetime's %z token does not respect the UTC offset form that has a colon, only the form without, even though both are valid per the spec.

In [1]: import dateutil.parser as dp

In [2]: import re
     ...: def validate_iso8601_us(str_val):
     ...:     try:
     ...:         dp.parse(str_val)
     ...:         if re.search(r'\.\d\d\d\d\d\d', str_val):
     ...:             return True
     ...:     except (TypeError, ValueError, OverflowError):
     ...:         pass
     ...:     return False
     ...:

In [3]: validate_iso8601_us('2019/08/15T16:03:5.12345')
Out[3]: False

In [4]: validate_iso8601_us('2019/08/15T16:03:5.123456')
Out[4]: True

In [5]: validate_iso8601_us('2019/08/15T16:03:5.123456+4')
Out[5]: True

In [6]: validate_iso8601_us('woof2019/08/15T16:03:5.123456+4')
Out[6]: False

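Recent versions of Python (3.7 onwards) provide the datetime.fromisoformat() class method in the standard library's datetime module. Before Python 3.11 it only accepts the formats produced by datetime.isoformat(); from 3.11 onwards it parses most ISO 8601 formats. See: https://docs.python.org/3.7/library/datetime.html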

So this will do the trick:

from datetime import datetime

def datetime_valid(dt_str):
    try:
        datetime.fromisoformat(dt_str)
    except (TypeError, ValueError):
        return False
    return True

Update:

I learned that fromisoformat() does not recognize the 'Z' suffix as valid (this applies before Python 3.11). As I wanted to support this in my API, I'm now using (after incorporating Matt's feedback):

from datetime import datetime

def datetime_valid(dt_str):
    try:
        datetime.fromisoformat(dt_str.replace('Z', '+00:00'))
    except (AttributeError, TypeError, ValueError):
        return False
    return True
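
If you also need to enforce the stricter form from the question (microsecond precision plus a UTC offset), one possible approach is to round-trip the parsed value through isoformat() and compare it with the input. This is only a sketch, not part of the answer above: the name is_strict_iso_timestamp is made up for illustration, and it assumes Python 3.7 or later.

```python
from datetime import datetime

def is_strict_iso_timestamp(dt_str):
    """Hypothetical helper: accept only 'YYYY-MM-DDTHH:MM:SS.ffffff+HH:MM'-style strings."""
    try:
        parsed = datetime.fromisoformat(dt_str)
    except (TypeError, ValueError):
        return False
    # A naive datetime means the string carried no UTC offset.
    if parsed.tzinfo is None:
        return False
    # Re-serialise with forced microseconds; any difference from the input
    # (missing fraction, space separator, 'Z' suffix, ...) fails the check.
    return parsed.isoformat(timespec='microseconds') == dt_str
```

With this check, '2016-12-13T21:20:37.593194+00:00' passes, while the same timestamp without the fraction, without the offset, or with a 'Z' suffix is rejected.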

Given the constraints you've put on the problem, you could easily solve it with a regular expression.

>>> import re
>>> re.match(r'^\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d\.\d{6}[+-]\d\d:\d\d$', '2016-12-13T21:20:37.593194+00:00')
<_sre.SRE_Match object; span=(0, 32), match='2016-12-13T21:20:37.593194+00:00'>

If you need to pass all variations of ISO 8601 it will be a much more complicated regular expression, but it could still be done. If you also need to validate the numeric ranges, for example verifying that the hour is between 0 and 23, you can put parentheses into the regular expression to create match groups then validate each group.
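
The match-group idea could be sketched roughly as follows. This is an illustration under my own assumptions rather than a definitive implementation: the names TIMESTAMP_RE and is_valid_timestamp are invented here, the date and time ranges are checked by handing the groups to the datetime constructor (so a leap second of 60 is rejected), and the offset limits of 23 hours and 59 minutes are simply a choice.

```python
import re
from datetime import datetime

# re.ASCII keeps \d limited to 0-9; the groups capture each numeric field.
TIMESTAMP_RE = re.compile(
    r'(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)\.\d{6}[+-](\d\d):(\d\d)',
    re.ASCII,
)

def is_valid_timestamp(text):
    """Hypothetical helper: check the shape with a regex, then the ranges."""
    match = TIMESTAMP_RE.fullmatch(text)
    if match is None:
        return False
    year, month, day, hour, minute, second, off_hour, off_minute = map(
        int, match.groups()
    )
    try:
        # The constructor rejects impossible values such as month 13 or 30 February.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    # Assumed limits for the UTC offset fields.
    return off_hour <= 23 and off_minute <= 59
```

Using fullmatch() instead of match() with ^ and $ also rejects a string that ends in a stray newline.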


Here is a crude but functional solution (for the narrower question) using datetime.strptime():

import datetime

def is_expected_datetime_format(timestamp):
    format_string = '%Y-%m-%dT%H:%M:%S.%f%z'
    try:
        # Slicing (rather than indexing) avoids an uncaught IndexError on short strings.
        if timestamp[-3:-2] != ':':
            raise ValueError()
        colonless_timestamp = timestamp[:-3] + timestamp[-2:]
        datetime.datetime.strptime(colonless_timestamp, format_string)
        return True
    except ValueError:
        return False
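
For what it's worth, the colon workaround may be unnecessary on newer interpreters: since Python 3.7, strptime()'s %z directive also accepts offsets written with a colon. A minimal sketch, assuming Python 3.7 or later (the names FORMAT and matches_format are mine, for illustration only):

```python
import datetime

FORMAT = '%Y-%m-%dT%H:%M:%S.%f%z'

def matches_format(timestamp):
    """Hypothetical helper: True if strptime() can parse the string with FORMAT."""
    try:
        # On Python 3.7+, %z accepts '+00:00' as well as '+0000'.
        datetime.datetime.strptime(timestamp, FORMAT)
    except ValueError:
        return False
    return True
```

Keep in mind that strptime() is lenient: %f accepts one to six fractional digits and %z still accepts the colonless form, so this checks "parses as this format" rather than "has exactly this shape".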

https://www.safaribooksonline.com/library/view/regular-expressions-cookbook/9781449327453/ch04s07.html

gives many variants for validating dates and times in ISO 8601 format (e.g., 2008-08-30T01:45:36 or 2008-08-30T01:45:36.123Z). The regex for the XML Schema dateTime type is given as:

>>> regex = r'^(-?(?:[1-9][0-9]*)?[0-9]{4})-(1[0-2]|0[1-9])-(3[01]|0[1-9]|[12][0-9])T(2[0-3]|[01][0-9]):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?(Z|[+-](?:2[0-3]|[01][0-9]):[0-5][0-9])?$'

So in order to validate you could do:

import re
match_iso8601 = re.compile(regex).match
def validate_iso8601(str_val):
    try:
        if match_iso8601(str_val) is not None:
            return True
    except TypeError:
        # Non-string input cannot be a valid timestamp.
        pass
    return False

Some examples:

>>> validate_iso8601('2017-01-01')
False

>>> validate_iso8601('2008-08-30T01:45:36.123Z')
True

>>> validate_iso8601('2016-12-13T21:20:37.593194+00:00')
True