str_replace Assigned data ... must be compatible with existing data


I have a data frame with 1,349,154 rows and 3 columns: SKU, count and state.

I run this code to replace all instances of "apples" with "Green Apples" and it works fine.

df$SKU <- str_replace_all(df$SKU, "apples", "Green Apples")

I then split the data and subsetted it using:

hs1_df$sc <- str_split(hs1_df$SKU, fixed(" - "), n = 2, simplify = TRUE)

hs1_df <- subset(df, count == 1)

I run the exact same code to replace "apples" with "Green Apples" and it errors out:

Assigned data ... must be compatible with existing data.

✖ Existing data has 631,580 rows.

✖ Assigned data has 1,263,160 rows.

"Existing data" matches the row count from the new data frame.

I was expecting the code to run equally well on the original data frame and the new one.

I thought I might have some nulls or NAs in there, so I replaced them with:

df$SKU <- df$SKU %>% replace_na('missing')

I also tried mutate functions and got the same error, so I know I messed something up. I just don't know what.

I see other posts where this error occurs, but it seems like it applies to a wide range of situations.


There are 1 best solutions below

stomper On

You seem to be mixing up your data frames in your example: your first line of code works on hs1_df, but your second overwrites hs1_df with a subset of df. The underlying problem, however, is that str_split with simplify = TRUE returns a character matrix (one row per input string, two columns here), which you are able to assign to one variable. str_replace_all, when given that matrix, returns a plain character vector with two values per row, which you are then attempting to assign to one variable. So you are getting an error because you have twice as many replacement values as you have places to assign them to; this matches your message, where 1,263,160 is exactly 2 x 631,580. You could fix this as I did below by capturing the results of str_replace_all as a list.

library(stringr)

df <- data.frame(name = "FirstName-LastName")


df$name <- str_split(df$name, fixed("-"), n = 2, simplify = TRUE)


df$name <- list(str_replace_all(df$name, "FirstName", "MiddleName"))

df

                  name
1 MiddleName, LastName
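
If a list column is not what you want, an alternative is to run the replacement on each column of the split matrix separately, so every result has exactly one value per row. This is only a minimal sketch: the data frame toy and the column names sku_name and sku_detail are made up for illustration and do not come from the question.

```r
library(stringr)

# Hypothetical three-row stand-in for hs1_df
toy <- data.frame(SKU = c("apples - red", "pears - green", "apples - small"))

# One row per input string, two columns (text before and after " - ")
parts <- str_split(toy$SKU, fixed(" - "), n = 2, simplify = TRUE)

# Replace within each column separately, so each result has one value per row
parts[, 1] <- str_replace_all(parts[, 1], "apples", "Green Apples")
parts[, 2] <- str_replace_all(parts[, 2], "apples", "Green Apples")

# Store the pieces as two ordinary character columns (names are assumptions)
toy$sku_name   <- parts[, 1]
toy$sku_detail <- parts[, 2]

toy
```

Because each assignment has the same length as the number of rows, the "must be compatible with existing data" check should no longer be triggered. Running str_replace_all on SKU before splitting would be another way to keep the lengths aligned.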