URL validation in Python - edge cases

252 Views Asked by Shili Ho At 21 June 2025 at 04:15

I'm trying to write/use URL validation in python by simply analyzing the string (no http requests) but I get a lot of edge cases with the different solutions I tried.

After looking at django's urlvalidator, I still have some edge cases that are misclassified:

def is_url_valid(url: str) -> bool:
    # from django urlvalidator
    url_pattern = re.compile(
        r'^(?:http|ftp)s?://'  # http:// or https://
        r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|'  # domain...
        r'localhost|'  # localhost...
        r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'  # ...or ip
        r'(?::\d+)?'  # optional port
        r'(?:/?|[/?]\S+)$', re.IGNORECASE)

 

    return bool(re.match(url_pattern, url))

we still get:

>>> is_url_valid('contact.html')
True

Other approaches we tried:

validators (python package), recommended by this SO Q&A

>>> import validators
>>> validators.url("firespecialties.com/bullseyetwo.html") # this is a valid url
ValidationFailure(func=url, args={'value': 'firespecialties.com/bullseyetwo.html', 'public': False})

from this validating urls in python SO Q&A while urllib.parse.urlparse('contact.html') correctly assess it as a path, it fails with urllib.parse.urlparse('www.images.com/example.html')`:

>>> from urllib.parse import urlparse
>>> urlparse('www.images.com/example.html')
ParseResult(scheme='', netloc='', path='www.images.com/example.html', params='', query='', fragment='')

Adapting logic from this javascript SO Q&A

Original Q&A

URL validation in Python - edge cases

Other approaches we tried:

There are 0 best solutions below

Related Questions in PYTHON

Related Questions in DJANGO

Related Questions in VALIDATION

Related Questions in URL-VALIDATION

Trending Questions

Popular # Hahtags

Popular Questions