How do I use Zyte with httr in R?


I need to use a rotating proxy IP service and opted to go with Zyte, as that's what we used at my former company. I'm having trouble using the Zyte API with R. I've been messing with it for three or four hours and can't work out how to supply the API endpoint, the target URL, and the API key. I'd prefer to use httr but will gladly use curl or RCurl if that's easier.

This is kind of a mish-mash of help from around the internet and ChatGPT:

library(httr)
library(jsonlite)

api_url <- "http://proxy.zyte.com:8011/api/v2/"
api_key <- "API_KEY"
target_url <- all_links.vec[100]
response <- GET(url = api_url,
                add_headers(`Proxy-Authorization` = paste('Basic', base64_enc(api_key)),
                            `targeturl` = target_url))

The closest I've been able to come is accidentally scraping Zyte's website.

Could anyone help me get this working? Link to Zyte's documentation: Zyte Documentation. I've paid for the Zyte API, not the Smart Proxy Manager.

I should note that I am ENTIRELY open to using a different service, but I am fairly wedded to using R. I'm on the first day of the trial phase for Zyte, so there's no harm in getting rid of it, but I'm on year 10 of using R, so it's vastly easier for me to stick with it than to switch to Python.

Carl Boneri On

Based on what little info I'm seeing in the help docs, I'm wondering if this might be more useful:


curl \
    --proxy https://api.zyte.com:8014 \
    --proxy-user YOUR_API_KEY: \
    --compressed \
    https://toscrape.com

So in R it would be more like:


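# As with `--proxy-user YOUR_API_KEY:` above, the API key is the proxy
# username and the password is left empty.
httr::GET("whatever_url_to_scrape.com", 
    httr::use_proxy(url = "api.zyte.com", port = 8011,
                    username = "API_KEY", password = "")
)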

Or maybe a POST, with the API key set as the user via basic authentication, as described in the docs:

?httr::authenticate
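
To make the POST idea concrete, here is a minimal sketch, not a tested recipe. It assumes the Zyte API's HTTP endpoint is `https://api.zyte.com/v1/extract`, that it accepts a JSON body with `url` and `httpResponseBody` fields, and that it returns the page body base64-encoded under `httpResponseBody`. None of that comes from this thread, so check it against the current Zyte docs. The function names and the `ZYTE_API_KEY` environment variable are made up for illustration.

```r
library(httr)
library(jsonlite)

# Build the request payload. The field names `url` and `httpResponseBody`
# are assumptions about the Zyte API; verify them in the docs.
zyte_payload <- function(target_url) {
  list(url = target_url, httpResponseBody = TRUE)
}

# Decode the base64-encoded page body from an already parsed response.
zyte_decode_body <- function(parsed) {
  rawToChar(jsonlite::base64_dec(parsed$httpResponseBody))
}

# Send one request. The API key goes in as the basic-auth username and the
# password is empty. The endpoint URL is an assumption (see above).
zyte_fetch <- function(target_url, api_key = Sys.getenv("ZYTE_API_KEY")) {
  resp <- httr::POST(
    "https://api.zyte.com/v1/extract",
    httr::authenticate(api_key, ""),
    body = zyte_payload(target_url),
    encode = "json"
  )
  httr::stop_for_status(resp)
  parsed <- jsonlite::fromJSON(
    httr::content(resp, as = "text", encoding = "UTF-8")
  )
  zyte_decode_body(parsed)
}
```

If that works, something like `html <- zyte_fetch(all_links.vec[100])` should hand back the page source as a single string, which you could then pass to `rvest::read_html()` or similar.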