Parse a string and extract a word delimited by comma and assign value from inside [] brackets

I need help to parse a string and extract a word delimited by comma and assign value from inside [] brackets. The input string is like this:

KEEP_DFB,?(y/n),[y];
DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:

and expected output is

KEEP_DFB=y
DFB_VERSION=1.4.2

The closest I could achieve using sed is this:

echo 'KEEP_DFB,?(y/n),[y]:' | sed 's/\([^,]*,\).*,\([^,]*\):.*/\1=\2/'

but it does not give the expected result.

I also tried cut, with the same result. Changing the delimiter via IFS is not allowed. Can you please help?

There are 7 best solutions below

2
On BEST ANSWER

You were fairly close:

$ printf "%s\n" 'DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:' 'KEEP_DFB,?(y/n),[y]:' |
> sed 's/\([^,]*\),.*,\[\([^],]*\)][;:].*/\1=\2/'
DFB_VERSION=1.4.2
KEEP_DFB=y
$

The first comma is moved outside the capture. The second capture is preceded by \[ (a literal [ in the data) and followed by a ]. That ] doesn't need a backslash escape because ] is only special when it is part of a character class. I'd be sorely tempted to add one anyway, and the command works fine with or without the backslash.

Sundeep noted that there's a semicolon instead of a colon in one of the data lines, but the example data in the echo has a colon rather than a semicolon (which is why I didn't spot the problem on the first pass; I copied the prototype command). That's trivially handled by using [;:] as a character class instead of a literal :.

The negated character class excludes ] and commas, though it isn't clear why commas need to be excluded. As written, the command would not recognize this line as valid:

VERSION_LIST,?(1.2/1.3/1.4/1.7),[1.4,1.7]:
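
If comma-separated values inside the brackets should be accepted, one option is to drop the comma from the negated class. This is a minimal sketch, assuming the bracketed value never contains a ] itself. The function name extract_pair is hypothetical and only wraps the sed command for reuse:

```shell
# extract_pair is a hypothetical helper name, not part of the original answer.
# It reads lines on stdin and prints NAME=VALUE for each matching line.
extract_pair() {
    # [^]]* matches everything up to the closing bracket, commas included.
    sed 's/^\([^,]*\),.*,\[\([^]]*\)][;:].*/\1=\2/'
}

printf '%s\n' 'VERSION_LIST,?(1.2/1.3/1.4/1.7),[1.4,1.7]:' | extract_pair
# prints: VERSION_LIST=1.4,1.7
```

Lines that do not match the pattern pass through unchanged, which is the same behaviour as the command above.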
2
On

You didn't say which shell you are going to use. In bash 4.2 or later (the negative length in ${original:0:-2} requires it), the following approach works:

# Drop the last two characters
x=${original:0:-2}
# Store the name part
name=${x%%,*}
# Store the value part
value=${x##*\[}

For example, if original contains DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:, name will contain DFB_VERSION and value will contain 1.4.2.
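
The ${original:0:-2} substring form is a bash-style extension rather than POSIX sh. A minimal sketch of the same idea using only POSIX parameter expansion follows. It assumes every line ends in ] plus exactly one terminator character, and parse_line is a hypothetical helper name:

```shell
# parse_line is a hypothetical helper. It assumes the argument ends in "]"
# followed by exactly one terminator character (";" or ":").
# Note: x, name and value are global, since POSIX sh has no "local".
parse_line() {
    x=${1%??}          # drop the last two characters, e.g. "]:"
    name=${x%%,*}      # everything before the first comma
    value=${x##*\[}    # everything after the last "["
    printf '%s=%s\n' "$name" "$value"
}

parse_line 'DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:'
# prints: DFB_VERSION=1.4.2
```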

BTW, why don't you want to modify IFS? Of course you don't want to change it permanently, but modifying it for just one statement does not affect the rest of the program.
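
To illustrate the scoping point, here is a small sketch, assuming a POSIX-compatible sh. Because read is an ordinary builtin, an IFS= assignment placed in front of it applies to that one command only. The variable names are made up for the example:

```shell
line='KEEP_DFB,?(y/n),[y];'
old_ifs=$IFS

# The IFS assignment prefix applies to this single read command only.
# Splitting on "," "[" and "]" yields: name, choices, an empty field,
# the bracketed value, and the trailing terminator.
IFS=',[]' read -r name choices empty value rest <<EOF
$line
EOF

printf '%s=%s\n' "$name" "$value"
# prints: KEEP_DFB=y

# IFS still holds its previous value here.
[ "$IFS" = "$old_ifs" ] && echo 'IFS unchanged'
```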

2
On

POSIX shell method, given input file 'foo':

while IFS=',[]' read a b c d e ; do echo "$a${a:+=}$d" ; done < foo

Output:

KEEP_DFB=y
DFB_VERSION=1.4.2
0
On

@Suresh K: Could you please try the following and let me know if it helps?

awk -F, '{match($0,/\[.*\]/);print $1"="substr($0,RSTART+1,RLENGTH-2)}' Input_file

I hope this helps.

3
On

You should try this code. It should work fine.

awk -F"," '{print $1,$3}' OFS="=" file_name | sed -e 's/\[\(.*\)\]./\1/'

Here awk prints the first and third comma-separated fields of each line, joined by =. sed then replaces the bracketed part, together with the single character that follows the ], with the text inside the brackets.
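
To see what each stage contributes, the pipeline can be run one half at a time. This is only an illustrative sketch: stage1 and stage2 are hypothetical wrapper names, and the explicit - operand (standard input) is added so that the OFS assignment is followed by a file operand:

```shell
# Stage 1 (hypothetical wrapper): keep fields 1 and 3, joined by "=".
stage1() {
    awk -F"," '{print $1,$3}' OFS="=" -
}

# Stage 2 (hypothetical wrapper): replace "[...]" plus the one character
# after it with the text inside the brackets.
stage2() {
    sed -e 's/\[\(.*\)\]./\1/'
}

printf '%s\n' 'KEEP_DFB,?(y/n),[y];' | stage1
# prints: KEEP_DFB=[y];

printf '%s\n' 'KEEP_DFB,?(y/n),[y];' | stage1 | stage2
# prints: KEEP_DFB=y
```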

You could also try this shorter one:

sed -e 's/,.*\[\(.*\)\]./=\1/' file

The output for both is:

KEEP_DFB=y
DFB_VERSION=1.4.2
1
On

I suggest:

sed 's/,.*\[/=/;s/].//' file

Output:

KEEP_DFB=y
DFB_VERSION=1.4.2
0
On
awk -F'[][,]' '{print $1"="$4}' file 

KEEP_DFB=y
DFB_VERSION=1.4.2