I am in R. I want to extract just the numbers from df1.
I have for example:
df1 <- data.frame(column1 = c("Any[12, 15, 20]", "Any[22, 23, 30]"), column2 = c("Any[4, 17]", "Any[]"), stringsAsFactors = FALSE)
And I want a new df that takes the integers within the brackets, multiplies them by the row number, and keeps the column they came from.
e.g. new_df could look like
| Time | Channel |
|---|---|
| 12 | column1 |
| 15 | column1 |
| 20 | column1 |
| 44 | column1 |
| 46 | column1 |
| 60 | column1 |
| 8 | column2 |
| 34 | column2 |
I do not need to preserve any "NA" values, e.g. if Any[] is empty. Does anyone have an idea whether this is possible? I have ENORMOUS amounts of data in this format, so I cannot really do much manually. Cheers!
I already tried:
new_df$Time <- as.integer(df1$column1)
and that just gave blanks.
I also tried:
new_df$Time <- str_extract_all(new_df$Time, "\\d+") %>% lapply(function(x) as.integer(x)) %>% sapply(function(x) ifelse(length(x) > 0, x, NA))
which then returned only the first integer within each bracket, e.g.
| Time | Channel |
|---|---|
| 12 | column1 |
| 44 | column1 |
| 8 | column2 |
How it works
This is similar to the approach you took in your initial attempt, except I am using `purrr::imap` instead of `lapply`. The advantage here is that `imap` gives you access to the list element name or, for an unnamed list like this one, its position (`.y`), which is the row number in this case, in addition to the list element itself (`.x`). This makes the multiplication step simple.

`str_extract_all` extracts all the numbers from a column and returns them as a list with one element per row. `imap` then iterates over this list and does the multiplication. The other two pipes reshape the data.
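A minimal sketch of the pipeline described above could look like this. It assumes dplyr (1.0 or later, for `across`), purrr, stringr and tidyr are installed. The helper name `extract_scaled` and the final `arrange`/`select` steps are my own additions for illustration, so treat this as one possible way to write it rather than the definitive version.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tidyr)

df1 <- data.frame(
  column1 = c("Any[12, 15, 20]", "Any[22, 23, 30]"),
  column2 = c("Any[4, 17]", "Any[]"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one column, pull out every run of digits per row,
# then multiply each row's numbers by that row's position.
extract_scaled <- function(col) {
  nums <- str_extract_all(col, "\\d+")   # list with one character vector per row
  imap(nums, ~ as.integer(.x) * .y)      # .x = numbers in the row, .y = row index
}

new_df <- df1 %>%
  mutate(across(everything(), extract_scaled)) %>%   # each column becomes a list-column
  pivot_longer(everything(), names_to = "Channel", values_to = "Time") %>%
  unnest(Time) %>%                                   # one row per number; empty Any[] rows drop out
  arrange(Channel) %>%                               # group the rows by source column
  select(Time, Channel)

print(new_df)
```

With `df1` exactly as written in the question, `column1` yields 12, 15, 20, 44, 46, 60. Note that `Any[4, 17]` sits in row 1 of `column2`, so it comes out as 4 and 17; the 8 and 34 in the expected table would only appear if that entry were in row 2. Empty entries such as `Any[]` disappear because `unnest` drops zero-length elements by default.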
Output