Iterating over a loop with two vectors of different lengths


I have two dataframes:

harv<-structure(list(spp = c("Other species (please list species name; i.e. Tarpon)", 
"Other species (please list species name; i.e. Amberjack)", "Red Drum (a.k.a. Redfish or Red)", 
"Other species (please list species name; i.e. Tarpon)", "Other species (please list species name; i.e. Amberjack)", 
"Red Drum (a.k.a. Redfish or Red)", "Other species (please list species name; i.e. Tarpon)"
), number = c(1, 2, 2, 4, 5, 7, 9)), class = "data.frame", row.names = c(NA, 
-7L))

oth<-structure(list(spp = c("Tink", "Chii", "Blue", "Red", "Blue")), class = "data.frame", row.names = c(NA, 
-5L))

I want to loop over harv so that each spp value containing "Other" is replaced, in order, by the next value in the oth data frame. Here is what the harv data frame looks like:

                                                       spp number
1    Other species (please list species name; i.e. Tarpon)      1
2 Other species (please list species name; i.e. Amberjack)      2
3                         Red Drum (a.k.a. Redfish or Red)      2
4    Other species (please list species name; i.e. Tarpon)      4
5 Other species (please list species name; i.e. Amberjack)      5
6                         Red Drum (a.k.a. Redfish or Red)      7
7    Other species (please list species name; i.e. Tarpon)      9

When I run through this loop:

for (i in seq_along(oth$spp)) {
  # Find indices where "Other" is present in spp
  indices <- grep("Other", harv$spp)
  # Replace spp values at the found indices with oth$spp[i]
  harv$spp[indices] <- oth$spp[i]
}

It gives me a data frame in which every "Other" row has been replaced by "Tink", the first element of oth$spp. The data frame looks like this:

                               spp number
1                             Tink      1
2                             Tink      2
3 Red Drum (a.k.a. Redfish or Red)      2
4                             Tink      4
5                             Tink      5
6 Red Drum (a.k.a. Redfish or Red)      7
7                             Tink      9

How do I create a loop so that each "Other" harv$spp is replaced sequentially with what is in the oth vector?

I want to get a dataframe that looks like this, replacing each "Other" with what is in the oth data frame in the same order as that data frame:

                               spp number
1                             Tink      1
2                             Chii      2
3 Red Drum (a.k.a. Redfish or Red)      2
4                             Blue      4
5                              Red      5
6 Red Drum (a.k.a. Redfish or Red)      7
7                             Blue      9

There are 2 best solutions below

Answer by Onyambu (best answer)

If you are sure that the number of times "Other" appears in spp equals the number of rows of oth, then do:

harv[grep('Other', harv$spp), 'spp'] <- oth

But this might not hold, in which case take the minimum of the two lengths so that only as many rows are replaced as there are replacement values:

idx <- grep('Other', harv$spp)
n <- min(length(idx), nrow(oth))

harv[head(idx,n), 'spp'] <- head(oth, n)
harv

                              spp number
1                             Tink      1
2                             Chii      2
3 Red Drum (a.k.a. Redfish or Red)      2
4                             Blue      4
5                              Red      5
6 Red Drum (a.k.a. Redfish or Red)      7
7                             Blue      9
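
To illustrate why the min() guard matters, here is a minimal sketch in which the replacement table is shorter than the number of "Other" rows. The name oth_short and the abbreviated species labels are assumptions made for this example, and the assignment uses plain character vectors rather than the data-frame form shown above.

```r
# Abbreviated copy of the question's data; only the "Other" prefix matters here
harv <- data.frame(
  spp = c("Other species (Tarpon)", "Other species (Amberjack)",
          "Red Drum (a.k.a. Redfish or Red)", "Other species (Tarpon)",
          "Other species (Amberjack)", "Red Drum (a.k.a. Redfish or Red)",
          "Other species (Tarpon)"),
  number = c(1, 2, 2, 4, 5, 7, 9),
  stringsAsFactors = FALSE
)

# Hypothetical replacement table with only three names
oth_short <- data.frame(spp = c("Tink", "Chii", "Blue"), stringsAsFactors = FALSE)

idx <- grep("Other", harv$spp)            # rows 1 2 4 5 7
n <- min(length(idx), nrow(oth_short))    # 3

# Replace only the first n matches; the remaining "Other" rows are left as they were
harv$spp[head(idx, n)] <- head(oth_short$spp, n)
harv$spp
# [1] "Tink"                             "Chii"
# [3] "Red Drum (a.k.a. Redfish or Red)" "Blue"
# [5] "Other species (Amberjack)"        "Red Drum (a.k.a. Redfish or Red)"
# [7] "Other species (Tarpon)"
```

Rows 5 and 7 keep their original text because there are no replacement names left for them.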
Answer by LMc

No need to loop: the base R functions needed here are vectorized, meaning they operate on all elements of a vector at once, so you do not have to step through and act on each element one at a time:

idx <- which(startsWith(harv$spp, "Other"))
harv$spp[idx] <- oth$spp[seq_along(idx)]
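
One thing worth knowing about the indexing above: if oth has fewer names than there are matched rows, indexing past the end of a vector yields NA, so the extra rows would become NA. A small sketch of that behaviour, using a hypothetical three-name vector oth_spp and the five row positions from the question:

```r
oth_spp <- c("Tink", "Chii", "Blue")   # hypothetical: only three replacement names
idx <- c(1L, 2L, 4L, 5L, 7L)           # the five "Other" rows from the question

# seq_along(idx) is 1:5, so positions 4 and 5 fall past the end of oth_spp
padded <- oth_spp[seq_along(idx)]
padded
# [1] "Tink" "Chii" "Blue" NA     NA
```

With the data in the question both lengths are five, so no NA values are introduced.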

Also be careful with grep here: it takes a regular expression, and the pattern as written matches "Other" anywhere in the string, not only at the start (maybe that is what you want). startsWith, by contrast, matches only a literal prefix.
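
A short sketch of the difference, using three made-up labels (the vector x is an assumption for illustration, not data from the question):

```r
# Hypothetical labels: the second one contains "Other" but does not start with it
x <- c("Other species (Tarpon)", "Snapper or Other", "Red Drum")

anywhere <- grep("Other", x)              # unanchored regex: matches anywhere
anchored <- grep("^Other", x)             # "^" anchors the match to the start
prefix   <- which(startsWith(x, "Other")) # literal prefix test, no regex involved

anywhere
# [1] 1 2
anchored
# [1] 1
prefix
# [1] 1
```

For the data in the question all three approaches select the same rows, since every "Other" label begins with that word.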


It is easiest to understand the issue you have encountered if you step through your loop one element at a time:

i <- 1
indices <- grep("Other", harv$spp)
# [1] 1 2 4 5 7

oth$spp[i]
# [1] "Tink"

harv$spp[indices]
# [1] "Other species (please list species name; i.e. Tarpon)"   
# [2] "Other species (please list species name; i.e. Amberjack)"
# [3] "Other species (please list species name; i.e. Tarpon)"   
# [4] "Other species (please list species name; i.e. Amberjack)"
# [5] "Other species (please list species name; i.e. Tarpon)" 

After the first iteration it is important to note that you have overwritten all values of harv$spp[indices] with "Tink" before the next iteration:

i <- 2
indices <- grep("Other", harv$spp)
# integer(0)

oth$spp[i]
# [1] "Chii"

harv$spp[indices]
# character(0)

Since you have overwritten all values of harv$spp at indices 1 2 4 5 7, grep returns an empty integer vector because no strings contain "Other" any more. Subsetting with an empty index vector in turn returns an empty result (e.g. harv$spp[integer(0)] will always return character(0) since harv$spp is character), and assigning into an empty selection changes nothing. So none of your remaining iterations do anything.
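
If you do want to keep an explicit loop, one way to avoid the problem described above is to compute the matching rows once, before anything is overwritten, and then assign one row per iteration. This is a sketch rather than the only way to write it; the species labels are abbreviated for this example, and the min() bound is an added safeguard for the case where the two lengths differ.

```r
# Abbreviated copy of the question's data; only the "Other" prefix matters here
harv <- data.frame(
  spp = c("Other species (Tarpon)", "Other species (Amberjack)",
          "Red Drum (a.k.a. Redfish or Red)", "Other species (Tarpon)",
          "Other species (Amberjack)", "Red Drum (a.k.a. Redfish or Red)",
          "Other species (Tarpon)"),
  number = c(1, 2, 2, 4, 5, 7, 9),
  stringsAsFactors = FALSE
)
oth <- data.frame(spp = c("Tink", "Chii", "Blue", "Red", "Blue"),
                  stringsAsFactors = FALSE)

# Find the "Other" rows once, outside the loop
indices <- grep("Other", harv$spp)

# Pair the i-th matched row with the i-th replacement name
for (i in seq_len(min(length(indices), nrow(oth)))) {
  harv$spp[indices[i]] <- oth$spp[i]
}

harv$spp
# [1] "Tink"                             "Chii"
# [3] "Red Drum (a.k.a. Redfish or Red)" "Blue"
# [5] "Red"                              "Red Drum (a.k.a. Redfish or Red)"
# [7] "Blue"
```

The key change from the original loop is that each iteration writes to a single position, indices[i], instead of to every matched row.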