I am trying to add a new column containing the year information from an existing column. Currently, I have a column titled County_and_Quarter, which holds the Pennsylvania county together with the year and the quarter (e.g. "Adams_2018 Q1"). I want to make a new column based only on the year from that column.
My code currently reads:
PA_State_County_Level_Sentencing_with_Year <- PA_State_County_Level_Sentencing %>%
  mutate(Year = if_else(str_detect(County_and_Quarter, "2013"),
                        "2013",
                        County_and_Quarter))
If I then add 2014, however, the 2013 value is removed from my new Year column. How do I keep all existing values in my new Year column?
Rather than detecting each year one at a time, you should extract the year values from the string. A direct approach would use stringr::str_extract to look for a regex pattern, such as four digits in a row ("\\d{4}").
However, for your format "county name_year quarter", a more complete solution might use tidyr::separate to first break off the county at the _ separator, and then use it again to split the year from the quarter at the space separator. With this you should end up with three columns, County, Year, and Quarter, which should make further work convenient. (Note that you can specify remove = FALSE if you want to keep the original column. The default will drop it.)
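As a minimal sketch of both approaches, assuming every value follows the "County_Year Quarter" pattern shown in the question: the small data frame sentencing and the intermediate column Rest are made up for illustration and are not part of your data. Note that the argument of tidyr::separate that keeps the input column is remove.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Hypothetical sample data in the "County_Year Quarter" format from the question
sentencing <- tibble(
  County_and_Quarter = c("Adams_2013 Q1", "Adams_2014 Q2", "Allegheny_2018 Q1")
)

# Approach 1: extract the first run of four digits as the year
with_year <- sentencing %>%
  mutate(Year = str_extract(County_and_Quarter, "\\d{4}"))

# Approach 2: split into County, Year, and Quarter in two steps
split_cols <- sentencing %>%
  # break off the county at the underscore; remove = FALSE keeps the original column
  separate(County_and_Quarter, into = c("County", "Rest"), sep = "_", remove = FALSE) %>%
  # split the remainder ("2013 Q1") at the space; "Rest" is dropped by default
  separate(Rest, into = c("Year", "Quarter"), sep = " ")
```

Both versions leave Year as a character column. If you need it as a number, wrapping the result in as.integer() should work for values like these.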