How can I scrape all content from each "option" of a "select" field of HTML with R?


I'm trying to use the rvest package to scrape a website.

This link will be used as an example: https://www.globalinnovationindex.org/analysis-indicator

The objective is to scrape the tables for all years (select id="ctl29_lstYear") and all indexes (select id="ctl29_lstIndex"). I already have a chunk that scrapes and formats those tables and turns them into lists (and yes, they are not an HTML <table>), but I can't get follow_link() or set_values() to navigate through the year and index options so that I can scrape them all.

Let's use a single pair of "options" for this example (year="2013" and index=" Innovation Efficiency Ratio").

So, I've looked at the rvest::set_values() documentation and I found this example:

    search <- html_form(read_html("http://www.google.com"))[[1]]
    set_values(search, q = "My little pony")

And then I tried this:

    > session<-html_form(read_html("https://www.globalinnovationindex.org/analysis-indicator"))[[1]]
    > set_values(session,list(ctl29$lstYear = "2013",ctl29$lstIndex="Innovation Efficiency Ratio"))
    Error: unexpected '=' in "set_values(session,list(ctl29$lstYear ="

Why was the '=' unexpected after the names of the fields that I want to modify? Is set_values() the best option for this kind of problem?
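
For reference, the parse error can be reproduced without rvest or any network access. The sketch below uses only base R and assumes the field names are exactly the strings `ctl29$lstYear` and `ctl29$lstIndex` from the call above. The variable names `bad_call`, `parse_error` and `fields` are just for illustration. An unquoted name containing `$` is read as the expression `ctl29 $ lstYear`, which cannot be an argument name, so the parser stops at the `=`. Wrapping the name in backticks makes it a single name.

```r
# The failing call, kept as text so the parse error can be captured
# instead of aborting the script.
bad_call <- 'list(ctl29$lstYear = "2013")'
parse_error <- tryCatch(
  parse(text = bad_call),
  error = function(e) conditionMessage(e)
)
# parse_error now holds the parser's message as a character string.

# Backticks turn the whole string, including the '$', into one argument name.
fields <- list(
  `ctl29$lstYear`  = "2013",
  `ctl29$lstIndex` = "Innovation Efficiency Ratio"
)

# The names are stored exactly as written between the backticks.
names(fields)
```

The same backtick quoting should apply when the names are passed to set_values() directly as name-value pairs rather than wrapped in list(), since the function takes its fields through `...`. Whether submitting the form that way actually returns the tables on this particular site is not verified here.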
