RSelenium - Not able to perform operations after hanging on a website


I am web scraping with RSelenium. After the browser hangs on a website for a few minutes, I get the following error message:

"Error in .Call(R_curl_fetch_memory, enc2utf8(url), handle, nonblocking) : reached elapsed time limit"

It always hangs during the following operation:

remDr$navigate(URL)

In the error handler, I want to close the current window and start a new webdriver. Unfortunately, I can no longer connect to the current window to perform any operation, so I cannot close it either: the operation is aborted because of an application callback. I suspect that the connection has been reset.

Since the website I am scraping hangs quite often, tens of windows end up open after a while, which slows everything down.

I don't know if it is relevant, but the setting for the webdriver is as follows:

prefs <- list(
  "profile.managed_default_content_settings.images" = 2L,
  "profile.default_content_settings.popups" = 0L,
  "excludeSwitches",
  "disable-popup-blocking" = TRUE
)
cprof <- list(chromeOptions = list(prefs = prefs, w3c = FALSE))
remDr <- remoteDriver(browserName = "chrome", extraCapabilities = cprof, port = 4444L)

Any help is appreciated.

I have tried the following methods to close the window, but none of them works:

  • remDr$close()
  • remDr$quit()
  • remDr$closeWindow()
  • remDr$closeall()

Each time, I get a message that no connection could be made to the server.

HoelR On

If you inspect the network tab of your browser's developer tools, you will see that the site fetches its data from an API. You can query that API directly, without a browser:

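library(tidyverse)
library(httr2)

# Call the JSON endpoint the page itself uses, then flatten the nested result
df <- "https://prod-public-api.livescore.com/v1/api/app/date/soccer/20240302/1?countryCode=NO&locale=en&MD=1" %>%
  request() %>%
  req_perform() %>%
  resp_body_json(simplifyVector = TRUE) %>%  # parse the JSON body into data frames
  pluck("Stages") %>%                         # keep the "Stages" element
  unnest(Events) %>%                          # expand the nested Events column
  unnest(c(T1, T2), names_sep = "_")          # expand team columns as T1_* and T2_*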

# A tibble: 1,291 × 50
   Sid   Snm        Scd   badgeUrl firstColor Cnm   Csnm  Ccd   CompId CompN CompD CompST   Scu Eid   Pids$`12` Media$`12`
   <chr> <chr>      <chr> <chr>    <chr>      <chr> <chr> <chr> <chr>  <chr> <chr> <chr>  <int> <chr> <chr>     <list>    
 1 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 2 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 3 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 4 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 5 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 6 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 7 14414 Premier L… prem… 2023-pr… 3F1152     Engl… Engl… engl… 65     Prem… Engl… Engla…     0 9682… SBTE_1_3… <df>      
 8 14500 LaLiga     lali… 2023-sp… 8B0D11     Spain Spain spain 75     LaLi… Spain Spain      0 9765… SBTE_1_3… <df>      
 9 14500 LaLiga     lali… 2023-sp… 8B0D11     Spain Spain spain 75     LaLi… Spain Spain      0 9765… SBTE_1_3… <df>      
10 14500 LaLiga     lali… 2023-sp… 8B0D11     Spain Spain spain 75     LaLi… Spain Spain      0 9765… SBTE_1_3… <df>      
# ℹ 1,281 more rows
# ℹ 38 more variables: Pids$`29` <chr>, $`8` <chr>, Media$`29` <list>, $`32` <list>, T1_Nm <chr>, T1_ID <chr>,
#   T1_Img <chr>, T1_NewsTag <chr>, T1_Abr <chr>, T2_Nm <chr>, T2_ID <chr>, T2_Img <chr>, T2_NewsTag <chr>, T2_Abr <chr>,
#   Eps <chr>, Esid <int>, Epr <int>, Ecov <int>, ErnInf <chr>, Et <int>, Esd <dbl>, EO <int>, EOX <int>, LS6 <int>,
#   Spid <int>, Pid <int>, Tr1 <chr>, Tr2 <chr>, Tr1OR <chr>, Tr2OR <chr>, Trh1 <chr>, Trh2 <chr>, Ewt <int>,
#   seriesInfo <df[,4]>, Awt <int>, Trp1 <chr>, Trp2 <chr>, secondColor <chr>
# ℹ Use `print(n = ...)` to see more rows
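
Since the original problem was requests that hang, the API call above could also be given an explicit timeout and an error handler. The following is only a minimal sketch under stated assumptions: the helper names `build_livescore_request()` and `fetch_livescore()` are hypothetical, the 30-second timeout and 3 tries are arbitrary, and the date segment of the URL is assumed to follow the `YYYYMMDD` pattern seen in the example. Note that `req_retry()` by default only retries responses it treats as transient (HTTP 429 and 503), so a timeout still ends up in the error handler.

```r
library(httr2)

# Hypothetical helper: build (but do not send) the request for one date.
# The URL pattern is assumed from the single example URL in this answer.
build_livescore_request <- function(date, timeout_sec = 30, max_tries = 3) {
  url <- paste0(
    "https://prod-public-api.livescore.com/v1/api/app/date/soccer/",
    format(as.Date(date), "%Y%m%d"),
    "/1?countryCode=NO&locale=en&MD=1"
  )
  req <- request(url)
  req <- req_timeout(req, timeout_sec)          # give up after timeout_sec seconds
  req <- req_retry(req, max_tries = max_tries)  # retry transient HTTP errors
  req
}

# Hypothetical helper: perform the request and return NULL instead of
# stopping the whole script when the request fails or times out.
fetch_livescore <- function(date, ...) {
  req <- build_livescore_request(date, ...)
  tryCatch(
    resp_body_json(req_perform(req), simplifyVector = TRUE),
    error = function(e) {
      message("Request for ", date, " failed: ", conditionMessage(e))
      NULL
    }
  )
}
```

With this, something like `res <- fetch_livescore("2024-03-02")` would return either the parsed JSON or `NULL`, so a loop over many dates can skip the failures and continue without leaving any browser windows behind.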