I am curious whether anyone has a good R equivalent of the Stata program command. I have a bunch of fixed-width .asc files that I need to load individually because the fixed-width positions for the same data change from year to year. Stata's program command handles this very succinctly, managing the changes in position file by file as the program runs.
I am unclear how to do this in R without loading each year's data individually and then calling rbind.
I understand that something like ldply(myfiles, read_csv) would do a good job on identically formatted files in a set directory, but because the positions of some of the variables change from file to file, I am stuck writing it out file by file too.
The target data frame would have the columns c("name", "fips", "var1", "var2", "year").
My attempt:
library(readr)
library(dplyr)

# Same columns in both files; only the var1/var2 positions differ
a97 <- read_fwf("/A/A2000.asc", fwf_positions(c(67, 122, 18533, 18563), c(91, 126, 18538, 18568), c("name", "fips", "var1", "var2"))) %>%
  filter(fips == "12345") %>%
  mutate(year = 1997)
a00 <- read_fwf("/A/A2004.asc", fwf_positions(c(67, 122, 17982, 18012), c(91, 126, 17987, 18017), c("name", "fips", "var1", "var2"))) %>%
  filter(fips == "12345") %>%
  mutate(year = 2000)
rbind(a97, a00)
A Stata program would create a data frame of name, fips, var1, var2, filename, and year, and then fill it using the file-by-file positions for var1 and var2:
program import_a
    infix str name 67-91 str fips 122-126 var1 `1' var2 `2' ///
        using "$\A\\`3'.asc", clear
    keep if fips=="12345"
    gen year = `4'
    append using "$output\a.dta"
    save "$output\a.dta", replace
end
import_a 18533-18538 18563-18568 A2000 1997
import_a 17982-17987 18012-18017 A2004 2000
I am not looking for a complete solution, just ideas on how to write a loop in R that does this with read_fwf().
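To make the idea concrete, here is a minimal sketch of what I imagine the R analogue could look like: a small wrapper function that plays the role of the Stata program, plus a table holding the per-file positions. The names import_a, import_all, specs, keep_fips, and the v1_/v2_ column names are made up for illustration. The column types are an assumption based on the infix call (str for name and fips, numeric for var1 and var2), and the "/A" directory is taken from my attempt above.

```r
library(readr)
library(dplyr)

# Hypothetical wrapper mirroring the Stata program: name and fips sit at
# fixed positions, while var1/var2 are passed in as c(start, end) per file.
import_a <- function(path, var1, var2, yr, keep_fips = "12345") {
  positions <- fwf_positions(
    start     = c(67, 122, var1[1], var2[1]),
    end       = c(91, 126, var1[2], var2[2]),
    col_names = c("name", "fips", "var1", "var2")
  )
  # Assumed types: character for name/fips, numeric for var1/var2.
  # Fixing them up front keeps the pieces compatible when stacked.
  types <- cols(
    name = col_character(), fips = col_character(),
    var1 = col_double(),    var2 = col_double()
  )
  read_fwf(path, col_positions = positions, col_types = types) %>%
    filter(fips == keep_fips) %>%
    mutate(year = yr)
}

# One row per file, like one import_a call per file in Stata.
specs <- data.frame(
  file     = c("A2000", "A2004"),
  v1_start = c(18533, 17982),
  v1_end   = c(18538, 17987),
  v2_start = c(18563, 18012),
  v2_end   = c(18568, 18017),
  yr       = c(1997, 2000),
  stringsAsFactors = FALSE
)

# Apply import_a to every row of the table and stack the results.
import_all <- function(specs, dir) {
  pieces <- Map(
    function(file, v1_start, v1_end, v2_start, v2_end, yr) {
      import_a(file.path(dir, paste0(file, ".asc")),
               var1 = c(v1_start, v1_end),
               var2 = c(v2_start, v2_end),
               yr   = yr)
    },
    specs$file, specs$v1_start, specs$v1_end,
    specs$v2_start, specs$v2_end, specs$yr
  )
  bind_rows(pieces)
}

# Intended usage with the real files:
# a <- import_all(specs, dir = "/A")
```

Adding another year would then mean adding one row to specs rather than another read_fwf() call. Map() names each piece after its file, so bind_rows(pieces, .id = "filename") might also give the filename column.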