I have a URL which could have any number of folders, and it ends with a filename.extension.
example:
https://cdn.example.com/user/image/upload/v87879798/images/profile/oaz4wkjkjsbzxa3xlkmu.jpg
I am trying to get everything after the /v87879798 version folder, thus:
images/profile/oaz4wkjkjsbzxa3xlkmu.jpg
I have tried a number of approaches, shown below, but nothing has worked. I will most likely need a regular expression, but I don't yet know them well enough to construct one. Some of the methods I tried:
import os
from urllib.parse import urlparse
url = "https://cdn.example.com/user/image/upload/v87879798/images/profile/oaz4wkjkjsbzxa3xlkmu.jpg"
parsed_url = urlparse(url)
# path: /user/image/upload/v87879798/images/profile
path = os.path.dirname(parsed_url.path)
# file_name: oaz4wkjkjsbzxa3xlkmu.jpg
file_name = os.path.basename(parsed_url.path)
But nothing has worked so far. Any help would be greatly appreciated.
Edit: Sorry, I forgot to mention: by "N number of folders" I mean that any of the URLs below is possible:
https://cdn.example.com/user/image/upload/v87879798/images/profile/oaz4wkjkjsbzxa3xlkmu.jpg
https://cdn.example.com/user/image/upload/v87879798/images/oaz4wkjkjsbzxa3xlkmu.jpg
https://cdn.example.com/user/image/upload/v87879798/oaz4wkjkjsbzxa3xlkmu.jpg
Use the regex pattern r"v\d+\/(.+)$", assuming the random number always starts with v.
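A minimal sketch of how that pattern could be applied with re.search, run against the three URLs from the question. The helper name path_after_version is hypothetical, and returning None when no version folder is found is an assumption about the desired behaviour:

```python
import re

# Pattern from the answer: a "v<digits>/" folder, then capture everything up to the end.
VERSION_TAIL = re.compile(r"v\d+\/(.+)$")


def path_after_version(url):
    """Return whatever follows the v<digits>/ folder, or None if there is none.

    Hypothetical helper, written only for illustration.
    """
    match = VERSION_TAIL.search(url)
    return match.group(1) if match else None


urls = [
    "https://cdn.example.com/user/image/upload/v87879798/images/profile/oaz4wkjkjsbzxa3xlkmu.jpg",
    "https://cdn.example.com/user/image/upload/v87879798/images/oaz4wkjkjsbzxa3xlkmu.jpg",
    "https://cdn.example.com/user/image/upload/v87879798/oaz4wkjkjsbzxa3xlkmu.jpg",
]

for url in urls:
    print(path_after_version(url))
# images/profile/oaz4wkjkjsbzxa3xlkmu.jpg
# images/oaz4wkjkjsbzxa3xlkmu.jpg
# oaz4wkjkjsbzxa3xlkmu.jpg
```

Note that re.search stops at the first v<digits>/ it finds, so this only behaves as intended if no earlier part of the URL happens to look like a version folder.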