R - Can't access CPS data using lodown package


The lodown package works great for me for the most part: I was able to download ACS and CES data without issue. But when I try to use it to access CPS data, I get the following output:

lodown( "cpsbasic" , output_dir = file.path( path.expand( "~" ) , "CPSBASIC" ) )
building catalog for cpsbasic

 Error in rvest::html_table(xml2::read_html(cps_ftp), fill = TRUE)[[2]] : 
  subscript out of bounds

Tried a fresh install of R and the packages involved, but I still get the same error. I think it has something to do with the Census updating their website since the package was last updated, but I'm not clear on what the specific problem is.

I did dig up the install files for the package. The specific lines of code at issue are below:

cps_ftp <- "https://www.census.gov/data/datasets/time-series/demo/cps/cps-basic.html"

cps_table <- rvest::html_table( xml2::read_html( cps_ftp ) , fill = TRUE )[[2]]

Not sure how active the developer of the package is in updating anymore, so I don't know that an update will be coming anytime soon. Any ideas?


There is 1 best solution below


We can read both .csv files linked from the cps_ftp page with:

library(rvest)
library(stringr)
library(readr)  # provides read_csv()

# get the links on the page
links <- 'https://www.census.gov/data/datasets/time-series/demo/cps/cps-basic.html' %>%
  read_html() %>%
  html_nodes('.uscb-layout-align-start-start') %>%
  html_nodes('a') %>%
  html_attr('href')

# keep only the csv links and prepend the scheme
# (this assumes the hrefs are protocol-relative, i.e. start with "//")
csv_links <- links %>% str_subset('csv') %>% paste0('https:', .)

# read the csv files into a list of data frames
csv_files <- lapply(csv_links, read_csv)
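
As for the original error: `[[2]]` fails with "subscript out of bounds" whenever `html_table()` returns fewer than two tables, which would fit the page layout having changed since the package was last updated. Below is a minimal offline sketch of that failure mode. The one-table toy page and the `get_table()` helper are made up for illustration and are not part of lodown.

```r
library(rvest)

# Toy page with a single table, standing in for a page whose layout changed
# (hypothetical content, not the real Census page)
page <- minimal_html(
  "<table><tr><th>file</th></tr><tr><td>jan.csv</td></tr></table>"
)

# html_table() on a document returns a list with one element per <table>
tables <- html_table(page)
n_tables <- length(tables)

# Hypothetical helper: return the i-th table, or stop with a clear message
# instead of the bare "subscript out of bounds"
get_table <- function(tables, i) {
  if (length(tables) < i) {
    stop("expected at least ", i, " table(s), found ", length(tables))
  }
  tables[[i]]
}

# The first table exists, so this succeeds
first <- get_table(tables, 1)

# Asking for a second table fails; capture the message rather than aborting
result <- tryCatch(
  get_table(tables, 2),
  error = function(e) conditionMessage(e)
)
```

Running `length(html_table(read_html(cps_ftp)))` against the live page is a quick way to check whether this is what is happening there.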