I'm trying to use the rvest package to scrape a website.
This link will be used as an example: https://www.globalinnovationindex.org/analysis-indicator
The objective is to scrape the tables for all years (select id="ctl29_lstYear") and all indexes (select id="ctl29_lstIndex"). I already have a chunk that scrapes and formats those tables and turns them into lists (and yes, they are not an HTML <table>), but I can't get follow_link() or set_values() to navigate through the year and index options so that I can scrape them all.
Let's use a single pair of options for this example (year="2013" and index="Innovation Efficiency Ratio").
So, I looked at the rvest::set_values() documentation and found this example:
search <- html_form(read_html("http://www.google.com"))[[1]]
set_values(search, q = "My little pony")
And then I tried this:
> session<-html_form(read_html("https://www.globalinnovationindex.org/analysis-indicator"))[[1]]
> set_values(session,list(ctl29$lstYear = "2013",ctl29$lstIndex="Innovation Efficiency Ratio"))
Error: unexpected '=' in "set_values(session,list(ctl29$lstYear ="
Why was the '=' after the names of the fields I want to modify unexpected? Is set_values() the best option for this kind of problem?
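
A minimal sketch of one possible cause, assuming the error comes from R's parser rather than from rvest: a name containing $, such as ctl29$lstYear, is not a syntactic R name, so it has to be wrapped in backticks when it is used as an argument name. The show_fields() helper below is hypothetical and only stands in for any function that receives named arguments through `...`; the field names and values are the ones from the attempt above.

```r
# Hypothetical stand-in for a function that takes named arguments via `...`
# (it is NOT part of rvest; it only returns the argument names it received).
show_fields <- function(...) {
  names(list(...))
}

# Without backticks, R parses ctl29$lstYear as "extract lstYear from ctl29",
# which cannot appear on the left of `=` in an argument list, hence the
# "unexpected '='" parse error. Backticks make the whole string a single name.
args <- list(
  `ctl29$lstYear`  = "2013",
  `ctl29$lstIndex` = "Innovation Efficiency Ratio"
)

# The names survive intact, including the `$`.
field_names <- names(args)
print(field_names)

# The same quoting works when the names are passed directly as arguments.
direct_names <- show_fields(`ctl29$lstYear` = "2013")
print(direct_names)
```

If that is the cause, the call would presumably become set_values(session, `ctl29$lstYear` = "2013", `ctl29$lstIndex` = "Innovation Efficiency Ratio"), with the fields passed as named arguments as in the documentation example rather than wrapped in list(). Whether the select fields accept these strings as option values is untested here.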