I am a beginner in R programming.
I would like to scrape football data from Squawka and put it in a data frame so that I can analyse it (football analytics is a new hobby of mine). More precisely, I want data from this kind of page: http://eredivisie.squawka.com/willem-ii-vs-psv/10-08-2014/dutch-eredivisie/matches.
On Stack Overflow I found a thread that explains how to do this: "how to scrape this squawka page?".
Unfortunately, when I run the code from that thread for converting the XML attributes and values into a data frame (see below), I get the following error message:
"Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = default.stringsAsFactors()) : numbers of columns of arguments do not match"
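    library(XML)  # provides xmlAttrs() and xmlValue()

    # `example` comes from the parsing step in the linked thread
    data <- lapply(example, function(x) {
      if (length(x['event']) > 0) {
        res <- lapply(x['event'], function(y) {
          # attributes of the event node, plus its start and end children
          matchAttrs <- as.list(xmlAttrs(y))
          matchAttrs$start <- xmlValue(y['start']$start)
          matchAttrs$end <- xmlValue(y['end']$end)
          matchAttrs
        })
        # the error appears to come from this call
        return(do.call(rbind.data.frame, res))
      }
    })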
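One possible cause, offered as a guess rather than a diagnosis: some events may carry more or fewer attributes than others, so the lists passed to rbind.data.frame have different lengths. The sketch below uses made-up records (the values and the missing `end` field are assumptions for illustration, not real Squawka data) to check whether base R raises the same kind of error in that situation.

```r
# Hypothetical event records; the second one lacks the `end` field.
res <- list(
  event  = list(player_id = "531", mins = "4", type = "Failed",
                start = "73.1,87.1", end = "97.9,49.1"),
  event5 = list(player_id = "311", mins = "6", type = "Failed",
                start = "92.3,13.1")
)

# Capture the error message instead of stopping the script.
err <- tryCatch(
  do.call(rbind.data.frame, res),
  error = function(e) conditionMessage(e)
)
print(err)
```

If the message printed here matches the one above, unequal attribute sets across events would be a plausible explanation.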
The outcome should look something like this:
            player_id mins secs minsec team   type     start       end
    event         531    4   39    279   44 Failed 73.1,87.1 97.9,49.1
    event5        311    6   33    393   31 Failed 92.3,13.1 93.0,31.0
    event1        376    8   57    537   31 Failed  97.7,6.1 96.7,16.4
    event6        311   13   50    830   31 Failed  99.5,0.5 94.9,42.6
    event11       311   14   11    851   31 Failed  99.5,0.5 93.1,51.0
    event7        311   17   41   1061   31 Failed 99.5,99.5 92.6,50.1
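A minimal sketch of one way to build a table of this shape when the records do not all have the same fields: pad every record to the union of field names with NA before binding. The records and the helper `pad_record` are hypothetical and only illustrate the idea in base R; this is not the method from the linked thread.

```r
# Hypothetical event records with unequal fields (assumed for illustration).
res <- list(
  event  = list(player_id = "531", mins = "4", type = "Failed",
                start = "73.1,87.1", end = "97.9,49.1"),
  event5 = list(player_id = "311", mins = "6", type = "Failed",
                start = "92.3,13.1")
)

# Union of all field names seen in any record.
all_fields <- unique(unlist(lapply(res, names)))

# Hypothetical helper: turn one record into a one-row data frame,
# filling fields the record does not have with NA.
pad_record <- function(rec) {
  row <- lapply(all_fields, function(f) {
    if (is.null(rec[[f]])) NA_character_ else rec[[f]]
  })
  as.data.frame(setNames(row, all_fields), stringsAsFactors = FALSE)
}

# Every row now has the same columns, so the rows can be bound together.
events <- do.call(rbind, lapply(res, pad_record))
print(events)
```

All columns stay character here; numeric fields such as mins would still need converting, for example with as.numeric().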
I have tried several other solutions from Stack Overflow that deal with similar situations, but so far I have not managed to find one that works.