I receive a comma-delimited file in which the string and date fields are enclosed in double quotes. Some string columns contain stray " characters and line feeds, as below:
"1234","asdf","with"doublequotes","new line
feed","withmultiple""doublequotes"
I want output like this:
"1234","asdf","withdoublequotes","new linefeed","withmultipledoublequotes"
I have tried:
sed 's/\([^",]\)"\([^",]\)/\1\2/g;s/\([^",]\)""/\1"/g;s/""\([^",]\)/"\1/g' < infile > outfile
It removes the single double quotes inside the strings, but it also removes the final double quote and leaves the line feed in place, as below:
"1234","asdf","withdoublequotes","new line
feed","withmultiple"doublequotes
Is there a way to remove the " characters and line feeds that occur between ", and ,"?
Your substitutions for two consecutive quotes only collapse the pair into a single quote, and because they come after the substitution for a lone quote, nothing removes the one quote that is left.
We can remove the " characters by repeating the substitution until it no longer applies (otherwise a quote inserted by the substitution would stay), and remove the line feeds by joining the next input line whenever the current line does not end with a quote:
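A minimal sketch of that approach, not necessarily the exact command originally intended, might look like the following. It assumes that every complete record ends with a double quote, that the file uses plain LF line endings, and that no field contains a quote directly next to a comma; the file names infile and outfile are taken from the question. Instead of the three separate substitutions, it uses one looped substitution that deletes any run of quotes sitting between two characters that are neither a quote nor a comma.

```shell
# Sample input from the question (the fourth field contains a line break).
printf '%s\n' '"1234","asdf","with"doublequotes","new line' \
              'feed","withmultiple""doublequotes"' > infile

sed '
# Join step: while the line does not end with a quote (and is not the
# last line of input), append the next line and delete the line feed.
:a
/[^"]$/{
  $!{
    N
    s/\n//
    ba
  }
}
# Strip step: delete a run of quotes that has a character other than
# a quote or comma on both sides. Repeat until nothing changes, so that
# overlapping cases such as a"b"c are cleaned up as well.
:b
s/\([^",]\)"\{1,\}\([^",]\)/\1\2/g
tb
' infile > outfile

cat outfile
```

On the sample input this prints the single line "1234","asdf","withdoublequotes","new linefeed","withmultipledoublequotes". The script is written one command per line so that it does not depend on GNU-specific handling of ; after labels.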