How to retrieve titles and prices from a web page using rvest


I am working on a web scraping task using R and rvest.

I am trying to get the names of smartphones as well as their prices from a store web page.

I am using the following code:

library(rvest)
library(dplyr)
#Code
first_page <- read_html('https://catalogo.claro.com.ec/postpago/catalogo?utm_source=catalogo&utm_medium=menu-header&utm_campaign=equipos-celulares&utm_content=navigation')
links <- first_page %>% html_nodes(xpath="//*[@id='catalogo--productos']/div/div/section/article[1]/div/h3") %>%
  html_text()

When I ran the code above, links came back empty (character(0)).

I checked the page's source code and set the XPath according to it, but it does not work. Is there any way to get the title and price of each smartphone using rvest?

Many thanks.

Best answer, by Grzegorz Sapijaszko

My attempt with {rvest}:

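url <- "https://catalogo.claro.com.ec/postpago/catalogo?utm_source=catalogo&utm_medium=menu-header&utm_campaign=equipos-celulares&utm_content=navigation"

# read_html_live() loads the page in a headless browser (it needs the chromote
# package and Chrome), so elements rendered by JavaScript are available.
ses <- rvest::read_html_live(url)

# Optional: open a live view of the page to inspect it interactively.
ses$view()

# Product names: every <h3> inside the product catalog section.
phones <- ses |>
  rvest::html_elements(xpath = '//*[@id="catalogo--productos"]/div/div/section') |>
  rvest::html_elements("h3") |>
  rvest::html_text()

# Prices: every element with the class "price-new".
prices <- ses |>
  rvest::html_elements(".price-new") |>
  rvest::html_text()

# Put names and prices side by side in a character matrix.
cbind(phones, prices)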
#>       phones                                                                                           
#>  [1,] "MOTOROLA EDGE NEO 40 (256GB)"                                                                   
#>  [2,] "SAMSUNG GALAXY S24 ULTRA (512GB) + BUDS 2"                                                      
#>  [3,] "SAMSUNG GALAXY S24 PLUS (512GB) + BUDS 2"                                                       
#>  [4,] "SAMSUNG GALAXY S24 (256GB) + BUDS 2 ONYX + TRAVEL ADAPTER 25W"                                  
[...]
#> [73,] "IPHONE 11 (64GB)"                                                                               
#>       prices     
#>  [1,] "$ 453,60" 
#>  [2,] "$ 1868,44"
#>  [3,] "$ 1286,88"
#>  [4,] "$ 1062,88"
[...]
#> [73,] "$ 588,00"
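
The prices in this output are still text, such as "$ 453,60". Below is a minimal sketch of converting them to numbers. It assumes the comma is the decimal mark and that there is no thousands separator, which is what the sample output suggests but is not confirmed for every price on the page. The helper parse_price and the data frame catalog are illustrative names, not part of rvest, and the three rows are copied from the output above.

```r
# Three name/price pairs copied from the output above.
phones <- c("MOTOROLA EDGE NEO 40 (256GB)",
            "SAMSUNG GALAXY S24 ULTRA (512GB) + BUDS 2",
            "IPHONE 11 (64GB)")
prices <- c("$ 453,60", "$ 1868,44", "$ 588,00")

# Hypothetical helper: keep only digits and the comma, then treat the
# comma as the decimal mark (an assumption about the price format).
parse_price <- function(x) {
  digits <- gsub("[^0-9,]", "", x)
  as.numeric(sub(",", ".", digits, fixed = TRUE))
}

# A data frame keeps the numeric column numeric; cbind() would coerce
# everything back to character.
catalog <- data.frame(phone = phones, price = parse_price(prices))

# Cheapest phones first.
catalog[order(catalog$price), ]
```

With numeric prices, sorting and filtering (for example catalog[catalog$price < 600, ]) work as expected.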

Created on 2024-02-20 with reprex v2.1.0
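
One caveat with cbind(phones, prices): the two vectors are scraped independently, so if any product lacks a .price-new element they differ in length, cbind() recycles the shorter one, and names and prices no longer line up. Below is a sketch of a per-product alternative, run on a small made-up HTML snippet so it works offline. The article / h3 / .price-new layout is an assumption modelled on the selectors used in this thread and has not been verified against the live page.

```r
library(rvest)

# Made-up snippet standing in for the rendered catalog; the second
# product deliberately has no price.
page <- minimal_html('
  <section>
    <article><div><h3>PHONE A</h3><span class="price-new">$ 100,00</span></div></article>
    <article><div><h3>PHONE B</h3></div></article>
    <article><div><h3>PHONE C</h3><span class="price-new">$ 300,00</span></div></article>
  </section>')

# One node per product card.
cards <- html_elements(page, "article")

# html_element() (singular) returns exactly one result per card, so a
# missing price becomes NA instead of shifting the rows that follow.
catalog <- data.frame(
  phone = cards |> html_element("h3") |> html_text(),
  price = cards |> html_element(".price-new") |> html_text()
)
catalog
```

The same pattern should carry over to the live session by passing ses instead of page, provided the card selector matches the real markup.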