find out filename from URL


I'm currently writing a download manager that has a feature for adding links from the clipboard.

Now I need to write a method to find out the filename of the file which is in the link, e.g. a method which outputs the filename:

"https://example.com/document.doc" outputs "document.doc"
"https://example.com/document.doc?smth=123" outputs "document.doc"
"https://exmaple.com/download?file=123" outputs the filename of the file behind this link

I tried the following, but it only works for the first link:

string[] uriparts = i.Split('/');
string filename = uriparts[uriparts.Length - 1];

There are 3 best solutions below

0
Alexander Burov On

You can use the Uri class to work with URLs. It splits a URL into separate parts such as host, scheme, path and query string. You can then use the Path class to extract the filename from the path:

var url = new Uri(i);
var fileName = Path.GetFileName(url.AbsolutePath);
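To see what this gives for the three URLs from the question, here is a minimal sketch. The helper name GetFileNameFromPath is made up for illustration, and the unescaping step is an addition (AbsolutePath is percent-encoded, so "my%20report.doc" would otherwise be returned as-is):

```csharp
using System;
using System.IO;

// Hypothetical helper: returns the last path segment of a URL, ignoring the query string.
string GetFileNameFromPath(string urlString)
{
    var uri = new Uri(urlString);
    // AbsolutePath is percent-encoded (e.g. "my%20report.doc"), so decode it first.
    return Path.GetFileName(Uri.UnescapeDataString(uri.AbsolutePath));
}

Console.WriteLine(GetFileNameFromPath("https://example.com/document.doc"));          // document.doc
Console.WriteLine(GetFileNameFromPath("https://example.com/document.doc?smth=123")); // document.doc
Console.WriteLine(GetFileNameFromPath("https://exmaple.com/download?file=123"));     // download
```

Note that the third URL yields "download", which is just a path segment and probably not the name of the file the server will send.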

However

In most cases you cannot get the filename just by looking at the URL. Even if the URL looks like it includes an actual filename (e.g. https://example.com/download/document.doc), there is no guarantee that following it will download a file named document.doc.

When the server specifies a file name, it is returned in the Content-Disposition header of the response you get when you request the URL.

So you can try to get it from the URL initially, but be prepared for the URL not to include a filename at all, and for the server to return a different filename when you execute the request.
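As a rough illustration of reading the name from that header, the sketch below parses a raw Content-Disposition value with ContentDispositionHeaderValue from System.Net.Http.Headers. The helper name GetFileNameFromDisposition and the sample header values are assumptions made for the example, not something taken from a real server:

```csharp
using System;
using System.Net.Http.Headers;

// Hypothetical helper: extracts a file name from a raw Content-Disposition header value.
string? GetFileNameFromDisposition(string headerValue)
{
    if (!ContentDispositionHeaderValue.TryParse(headerValue, out var disposition) || disposition is null)
        return null;

    // FileNameStar corresponds to the extended "filename*" parameter when present;
    // FileName is the plain "filename" parameter, which may still be wrapped in quotes.
    string? name = disposition.FileNameStar ?? disposition.FileName?.Trim('"');
    return string.IsNullOrWhiteSpace(name) ? null : name;
}

Console.WriteLine(GetFileNameFromDisposition("attachment; filename=\"document.doc\"")); // document.doc
Console.WriteLine(GetFileNameFromDisposition("inline") ?? "(no file name)");            // (no file name)
```

A null result means the caller has to fall back to the URL or to a generated name.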

As a fallback, you might need to generate your own filename in case the server doesn't return one. For example (here I'm using the MimeTypes NuGet package to map the media type to a file extension when generating the file name):

var url = new Uri(urlString);

string GenerateFileName(HttpResponseMessage response)
{
    string? extension = response.Content.Headers.ContentType is { MediaType: { } mediaType }
        ? MimeTypes.GetMimeTypeExtensions(mediaType).FirstOrDefault()
        : null;

    return $"Unnamed{extension}";
}

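using (var httpClient = new HttpClient())
{
    var response = await httpClient.GetAsync(url);
    // Prefer the server-supplied name; the header value may be wrapped in quotes.
    string? headerName = response.Content.Headers.ContentDisposition?.FileName?.Trim('"');
    // Path.GetFileName returns "" (not null) when the path ends with '/',
    // so check for an empty string instead of relying on ??.
    string urlName = Path.GetFileName(url.AbsolutePath);
    string fileName = !string.IsNullOrEmpty(headerName) ? headerName
        : !string.IsNullOrEmpty(urlName) ? urlName
        : GenerateFileName(response);
}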
0
Fabio On

Your method should proceed as follows:

  1. Extract the path of the URL and get the file name
  2. If the file name is empty, try to get the 'file' query parameter

The code could look something like this:

// Requires: using System; using System.IO; using System.Web;
string GetFileNameFromUrl(string url)
{
    var uri = new Uri(url);

    // Get the filename from the path part of the URL
    string fileName = Path.GetFileName(uri.LocalPath);

    // If the filename is empty, try to get it from the query parameters
    if (string.IsNullOrEmpty(fileName) && !string.IsNullOrEmpty(uri.Query))
    {
        // Parse the query string; a missing 'file' key yields null
        var queryParams = HttpUtility.ParseQueryString(uri.Query);
        fileName = queryParams["file"] ?? string.Empty;
    }

    return fileName;
}


0
quyentho On

Based on your description, try this:

public string ExtractFileNameFromUrl(string url)
{
    Uri uri = new Uri(url);
    string query = uri.Query;
    var queryParams = HttpUtility.ParseQueryString(query);

    if (queryParams.HasKeys() && queryParams.AllKeys.Contains("file"))
    {
        return queryParams["file"];
    }

    // Keep only the last path segment, so "/files/document.doc" yields "document.doc"
    string fileName = Path.GetFileName(uri.AbsolutePath);

    return fileName;
}
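For reference, a small sketch of what the query lookup returns for the third URL from the question. It assumes the parameter is literally called "file", as in that example; other sites may use a different name:

```csharp
using System;
using System.Web;

var uri = new Uri("https://exmaple.com/download?file=123");

// ParseQueryString accepts the query including its leading '?' and returns a NameValueCollection.
var queryParams = HttpUtility.ParseQueryString(uri.Query);
string? fileParam = queryParams["file"];

Console.WriteLine(fileParam); // 123
```

Here the value is "123", which looks more like an identifier than a filename, so for links of this kind the real name most likely has to come from the server response.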