Download.file keeping original name or sheet name

56 Views Asked by At

I want to download the following link in R keeping the file name after browser download. http://www.coingecko.com/price_charts/export/1/usd.xls

Notice that the file name will be 'btc-usd-max.xls' if you download the file through browser. However, if I run the following code, the file name will be 'usd.xls'. However, the file name 'btc-usd-max.xls' conveys relevant information. Is it possible to retrieve this 'btc-usd-max' name using R?

link <- 'http://www.coingecko.com/price_charts/export/1/usd.xls'
download.file(url=link, destfile=basename(link), method='auto')

The solution provided in the following topic does not solve the problem. R download.file to a folder keeping original name

The aforementioned solution depends on content-disposition, which does not exists.

library(httr)
hd <- HEAD(link)
filename <- gsub(".*name=", "", headers(hd)$`content-disposition`)

I have tried using HEAD() command from httr package and also downloading the files using the download.file function.

1

There are 1 best solutions below

4
On BEST ANSWER

The displayed file name is in a response header called content-disposition, but only if you get a status 200 from the server. The default user-agent in httr causes the server to return a status 403 that does not return this header, so you need to add a user agent that the server will respond to.

Once this is done it's fairly easy to extract the file name.

library(httr)

url <- 'http://www.coingecko.com/price_charts/export/1/usd.xls'
UA <- "Mozilla/5.0 (Windows NT 6.1; rv:85.0) Gecko/20100101 Firefox/85.0"
res <- GET(url, user_agent(UA))
filename <- strsplit(res$headers$`content-disposition`, '\"')[[1]][2]

Now we have

filename
#> [1] "btc-usd-max.xls"

And we can write the result like this:

writeBin(res$content, filename)

Strangely enough though, the actual binary data appears to be in csv format rather than xls format (I get a warning when I try to open it in Excel, whether downloading in R or directly in the browser). You might prefer to read it as a csv using:

data <- read.csv(text = content(res, 'text'))

Then write data to an Excel sheet named according to the filename.