split file into columns with awk


I have a file which looks like this:

1. result = 1.2.3.4 (1.2.3.4)
   info: [Affected]

2. result = www.addr.com (2.3.4.5)
   info: [not Affected]

I want to split it into three columns, for example:

1.2.3.4       1.2.3.4   Affected
www.addr.com   2.3.4.5   not Affected

I am using awk for that: cat filename.txt | awk -F "[=()'']" '{print $2 $3 $4}'

but I still do not get three columns per row. How can I fix it? A second question: is there a better alternative to awk?


There are 3 best solutions below

0
On BEST ANSWER

You can unset the record separator to read in each block separately, like this:

$ cat file
1. result = 1.2.3.4 (1.2.3.4)
   info: [Affected]

2. result = www.addr.com (2.3.4.5)
   info: [not Affected]
$ awk -F'[]=():[:space:][]+' -v RS= '{print $3, $4, $6 (NF==8?" " $7:"")}' file
1.2.3.4 1.2.3.4 Affected
www.addr.com 2.3.4.5 not Affected

The ternary at the end handles the two different numbers of fields (7 or 8, depending on "Affected" or "not Affected"). If there are 8 fields, then the seventh one is printed after a space, otherwise, nothing is printed.
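To see where the 7 and 8 come from, it can help to print every field of each block. The following is a minimal sketch, assuming the sample input from the question is held in a shell variable; the names sample and show_fields are hypothetical and used only for illustration. Note that the closing ] at the end of each block acts as a separator too, so the last field is empty.

```shell
# Sample input copied from the question (variable name is illustrative).
sample='1. result = 1.2.3.4 (1.2.3.4)
   info: [Affected]

2. result = www.addr.com (2.3.4.5)
   info: [not Affected]'

# Print NF and every numbered field for each blank-line-separated block.
# Same -F and RS settings as in the answer above; reads from stdin.
show_fields() {
  awk -F'[]=():[:space:][]+' -v RS= '{
    printf "block %d: NF=%d\n", NR, NF
    for (i = 1; i <= NF; i++) printf "  $%d=<%s>\n", i, $i
  }'
}

printf '%s\n' "$sample" | show_fields
```

In the second block, "not" and "Affected" land in separate fields ($6 and $7), which is why the ternary appends $7 only when NF is 8.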

To achieve a more neatly formatted output, you can use printf instead of print:

$ awk -F'[]=():[:space:][]+' -v RS= '{printf "%-12s%10s   %s%s%s", $3, $4, $6, (NF==8?" " $7:""), ORS}' file
1.2.3.4        1.2.3.4   Affected
www.addr.com   2.3.4.5   not Affected

The format specifiers dictate the width of each field. A - causes the content to be left-aligned. ORS is the Output Record Separator, which is a newline on your platform by default.
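The effect of the width and the - flag is easier to see with visible delimiters. This is a small sketch, not part of the original command; the helper name format_row and the square brackets around each column are assumptions added purely to make the padding visible.

```shell
# %-12s pads on the right to 12 characters (left-aligned);
# %10s pads on the left to 10 characters (right-aligned).
# The [ ] around each column only make the padding visible.
format_row() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "[%-12s][%10s]%s", a, b, ORS }'
}

format_row "1.2.3.4" "2.3.4.5"
# prints: [1.2.3.4     ][   2.3.4.5]
```

Values longer than the given width are not truncated, so the widths may need adjusting for longer host names.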

In terms of aligning the columns, it depends on whether you're looking for something human- or machine-readable. If you're looking to import this data into a spreadsheet, perhaps you could separate each column using a tab character \t (for example), which could be done by adding -v OFS='\t' to the first version of my answer.
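As a sketch of the tab-separated variant described above, the first command can be wrapped as follows. The names sample and to_tsv are hypothetical, and the input is assumed to be the sample from the question, passed on stdin.

```shell
# Sample input copied from the question (variable name is illustrative).
sample='1. result = 1.2.3.4 (1.2.3.4)
   info: [Affected]

2. result = www.addr.com (2.3.4.5)
   info: [not Affected]'

# Same command as the first version, with OFS set to a tab so that
# the commas in the print statement emit tab characters.
to_tsv() {
  awk -F'[]=():[:space:][]+' -v RS= -v OFS='\t' \
    '{print $3, $4, $6 (NF==8?" " $7:"")}'
}

printf '%s\n' "$sample" | to_tsv
```

Only the commas in print are replaced by OFS; the space between "not" and "Affected" is added by string concatenation and stays a plain space, so the status remains a single column.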

10
On

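You need to read each block as a single record. You can do this in awk by setting RS to the empty string (RS=), which enables paragraph mode: records are then separated by blank lines. This is standard awk behavior and not specific to GAWK.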

awk -vRS= -F"[)(=\n]+" '{print $2 $3 $4}' file

1.2.3.4 1.2.3.4   Affected
www.addr.com 2.3.4.5   not Affected
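
The output above appears to match an earlier form of the input, in which the second line of each block held only the status. With the info: [...] lines shown in the question, $4 would also contain the "info: [" prefix and the closing bracket. One possible adaptation, offered as a sketch rather than the answerer's own command, adds the square brackets and any surrounding spaces to the separator; the names sample and split_blocks are hypothetical.

```shell
# Sample input copied from the question (variable name is illustrative).
sample='1. result = 1.2.3.4 (1.2.3.4)
   info: [Affected]

2. result = www.addr.com (2.3.4.5)
   info: [not Affected]'

# Paragraph mode (RS=) plus a separator made of brackets, parentheses,
# "=" or newline, together with any spaces around them.
# Fields per block: $1="N. result", $2=name, $3=address, $4="info:", $5=status.
split_blocks() {
  awk -v RS= -F' *[][)(=\n]+ *' '{print $2, $3, $5}'
}

printf '%s\n' "$sample" | split_blocks
```

Because a space on its own is not a separator here, "not Affected" stays in a single field and no ternary is needed.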
0
On

Some more awk

Input

$ cat file
1. result = 1.2.3.4 (1.2.3.4)
   Affected

2. result = www.addr.com (2.3.4.5)
   not Affected

Output

$ awk  's{print $0}s=/^[0-9]+\./{ gsub(/[()]/,"");printf ("%s %s", $4,$5);next}' file
1.2.3.4 1.2.3.4   Affected
www.addr.com 2.3.4.5   not Affected

-- Edit -- for revised input

$ cat file
1. result = 1.2.3.4 (1.2.3.4)
   info: [Affected]

2. result = www.addr.com (2.3.4.5)
   info: [not Affected]

Output

$ awk  '{gsub(/[()\[\]]/,"")}s{$1="";print $0}s=/^[0-9]+\./{printf ("%s %s", $4,$5);next}' file
1.2.3.4 1.2.3.4 Affected
www.addr.com 2.3.4.5 not Affected