Does anyone know how to filter a Socrata dataset by date_of_incident in R at the import step, so that less data is downloaded and the read is faster?
This is what I have so far:
library(RSocrata)  # provides read.socrata()
library(dplyr)     # provides %>% and filter()
token <- "n15hFiXqJU6DBItiSjA4jWD2U"
PoliceIncidents <- read.socrata("https://www.dallasopendata.com/resource/qv6i-rri7.csv", app_token = token)
# Filter police incident data to 2019 through the present
PoliceIncidents2019to2020 <- PoliceIncidents %>% filter(servyr > 2018)
Here is the source data: https://www.dallasopendata.com/Public-Safety/Police-Incidents/qv6i-rri7/data
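One way to filter before the download is to put a SoQL $where clause in the URL, since the Socrata API applies it on the server and RSocrata accepts query parameters in the URL string. A rough sketch, assuming servyr is exposed as a numeric field by the API (if it is stored as text, the comparison value may need quotes). soql_url and read_incidents_since are hypothetical helper names, not RSocrata functions:

```r
# Hypothetical helper (not part of RSocrata): append a URL-encoded SoQL
# $where clause to a Socrata resource URL.
soql_url <- function(base_url, where) {
  paste0(base_url, "?$where=", utils::URLencode(where, reserved = TRUE))
}

# Hypothetical wrapper: download only incidents with servyr after `year`.
# Assumes servyr is numeric on the server side.
read_incidents_since <- function(year, token) {
  url <- soql_url(
    "https://www.dallasopendata.com/resource/qv6i-rri7.csv",
    paste("servyr >", year)
  )
  RSocrata::read.socrata(url, app_token = token)
}

# The URL that would be requested for "2019 to present"
filtered_url <- soql_url(
  "https://www.dallasopendata.com/resource/qv6i-rri7.csv",
  "servyr > 2018"
)
```

Calling read_incidents_since(2018, token) should then return only the 2019 and later rows. The same pattern should work for date_of_incident if you swap in that field's API name and a date literal, though I haven't checked how that column is named in the API.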
For big CSVs, I like the vroom package from the tidyverse. It's a lot faster than read_csv. With vroom, it's often easier to read the whole file and then filter.
Reading the whole dataset this way only took about 10 seconds.
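A minimal sketch of the read-everything-then-filter approach, assuming the dataset has already been exported to a local CSV. read_and_filter and the file name in the usage line are hypothetical, and the filter reuses the servyr column from the question:

```r
library(vroom)
library(dplyr)

# Hypothetical helper: read the full CSV with vroom, then keep 2019 onward.
# Assumes the file has a numeric servyr column.
read_and_filter <- function(csv_path) {
  vroom(csv_path, delim = ",") %>%
    filter(servyr > 2018)
}
```

Usage would look like PoliceIncidents2019to2020 <- read_and_filter("Police_Incidents.csv"). The filter still runs after the full read, so this speeds up parsing rather than reducing what gets downloaded.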