R: use regex to select all fields in a data.frame


I need to save a data.frame with write.table in R. The problem is that some values (downloaded from the internet) contain a lone double quote ("). write.table doesn't let me choose a different quote character the way read.table does (which I find annoying). So I've read about using gsub() to select all fields and wrap them in a different quotation mark, and then calling write.table with quote=FALSE and sep="\t".

Let's say this is my table:

field1  field2  field3
valueA  valueB  valueC
valueD  valueE  valueF
valueG  value\"H    valueI

Because of the \" in value\"H I have problems with quoting, so I need a different quote character, one that I'm sure won't appear anywhere else in the file, say, a Chinese character. So, I want to produce this with gsub:

乃field1乃    乃field2乃    乃field3乃
乃valueA乃    乃valueB乃    乃valueC乃
乃valueD乃    乃valueE乃    乃valueF乃
乃valueG乃    乃value\"H乃  乃valueI乃

But how do I select all fields with gsub()? I can't find the correct regex for that. Thanks in advance!


There are 2 best solutions below

BEST ANSWER

You can try paste0(), which wraps every value without needing a regex:

 df1[] <- lapply(df1, function(x) paste0('乃', x, '乃'))
 df1
 #   field1      field2     field3
 #1 乃valueA乃  乃valueB乃 乃valueC乃
 #2 乃valueD乃  乃valueE乃 乃valueF乃
 #3 乃valueG乃 乃value"H乃 乃valueI乃

data

 df1 <- structure(list(field1 = c("valueA", "valueD", "valueG"), 
 field2 = c("valueB", 
 "valueE", "value\"H"), field3 = c("valueC", "valueF", "valueI"
 )), .Names = c("field1", "field2", "field3"), row.names = c(NA, 
 -3L), class = "data.frame")
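
To connect this back to the question, here is a minimal sketch of the whole workflow: wrap the values and the column names, then write a tab-separated file with quote = FALSE. The names q, wrapped and out_file are made up for illustration, the output goes to a temporary file, and 乃 is assumed never to occur in the data.

```r
# Same data as df1 above, rebuilt so the sketch runs on its own.
df1 <- data.frame(
  field1 = c("valueA", "valueD", "valueG"),
  field2 = c("valueB", "valueE", "value\"H"),
  field3 = c("valueC", "valueF", "valueI"),
  stringsAsFactors = FALSE
)

q <- "乃"  # assumption: this character never appears in the data

# Wrap every value; df[] <- lapply(...) keeps the data.frame structure.
wrapped <- df1
wrapped[] <- lapply(df1, function(x) paste0(q, x, q))

# The desired output in the question also wraps the header row.
names(wrapped) <- paste0(q, names(df1), q)

# Hypothetical output location; replace with a real path as needed.
out_file <- tempfile(fileext = ".tsv")

# quote = FALSE stops write.table from adding its own double quotes.
write.table(wrapped, out_file, quote = FALSE, sep = "\t",
            row.names = FALSE, fileEncoding = "UTF-8")

lines_out <- readLines(out_file, encoding = "UTF-8")
print(lines_out)
```

Wrapping the names separately is needed because lapply() over a data.frame only touches the column contents, not the header.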

Just for completeness (akrun's version via paste is more appropriate here), this is one using gsub:

df <- read.table(text='field1 field2 field3
                       valueA valueB valueC
                       valueD valueE valueF
                       valueG value\"H valueI')

as.data.frame( lapply(df, function(x) gsub("(.*)","乃\\1乃",x)) )

#          V1          V2        V3
# 1 乃field1乃  乃field2乃 乃field3乃
# 2 乃valueA乃  乃valueB乃 乃valueC乃
# 3 乃valueD乃  乃valueE乃 乃valueF乃
# 4 乃valueG乃 乃value"H乃 乃valueI乃
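
A possible variant of the regex approach, offered only as a sketch: anchoring the pattern with ^ and $ and using sub() makes it explicit that each value is matched exactly once as a whole, and assigning into df[] keeps the original column names. The helper wrap() and the hand-built df are assumptions for illustration, not part of the answer above.

```r
# Illustrative data with proper column names (built by hand here,
# rather than read with read.table as in the answer above).
df <- data.frame(
  field1 = c("valueA", "valueD", "valueG"),
  field2 = c("valueB", "valueE", "value\"H"),
  field3 = c("valueC", "valueF", "valueI"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: capture the whole string and re-emit it
# between two copies of the replacement quote character.
wrap <- function(x) sub("^(.*)$", "乃\\1乃", x)

# Apply to every column in place, then to the header.
df[] <- lapply(df, wrap)
names(df) <- wrap(names(df))

print(df)
```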