Keeping format of a pdb file with conditionals


I'm new to awk, and I'm trying to modify column 3 by appending a number to it (I tried using NR for the numbering) whenever column 1 is the word HETATM.

My input file is:

HETATM   25  O   UNL     1      86.047  83.059 103.165  1.00  0.00           O
HETATM   26  N   UNL     1      87.071  82.457 102.433  1.00  0.00           N
HETATM   27  C   UNL     1      91.764  77.729  97.523  1.00  0.00           C
HETATM   28  O   UNL     1      92.740  78.174  98.137  1.00  0.00           O
HETATM   29  H   UNL     1      90.477  80.552  97.677  1.00  0.00           H
CONECT    1    2
CONECT    2    1    3
CONECT    3    2    4    7

The output that I want is:

HETATM   25  O25   UNL     1      86.047  83.059 103.165  1.00  0.00           O
HETATM   26  N26   UNL     1      87.071  82.457 102.433  1.00  0.00           N
HETATM   27  C27   UNL     1      91.764  77.729  97.523  1.00  0.00           C
HETATM   28  O28   UNL     1      92.740  78.174  98.137  1.00  0.00           O
HETATM   29  H29   UNL     1      90.477  80.552  97.677  1.00  0.00           H
CONECT    1    2
CONECT    2    1    3
CONECT    3    2    4    7

I'm using the command below and trying to keep the original format of the file, but I haven't managed to. Can you help me, please?

awk 'BEGIN{FS=OFS="\t";}{if($1=="HETATM"){$3=$3NR};print $0}' file.pdb

Thanks a lot.

Best answer:

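Using any sed (this appends the atom number from column 2 to the atom name in column 3 and leaves the surrounding whitespace untouched):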

$ sed 's/^HETATM *\([^ ]*\) *[^ ]*/&\1/' file
HETATM   25  O25   UNL     1      86.047  83.059 103.165  1.00  0.00           O
HETATM   26  N26   UNL     1      87.071  82.457 102.433  1.00  0.00           N
HETATM   27  C27   UNL     1      91.764  77.729  97.523  1.00  0.00           C
HETATM   28  O28   UNL     1      92.740  78.174  98.137  1.00  0.00           O
HETATM   29  H29   UNL     1      90.477  80.552  97.677  1.00  0.00           H
CONECT    1    2
CONECT    2    1    3
CONECT    3    2    4    7
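If you would rather stay in awk for space-separated input like the sample, a minimal sketch along the same lines is shown here. It is only one possible approach, not part of the original answer: instead of assigning to $3 (which makes awk rebuild the record with OFS and collapses the spacing), it splices $2 into $0 right after the atom name. The function name append_serial is made up for this illustration, and the sketch assumes "HETATM" is separated from the atom number by at least one space.

```shell
#!/bin/sh
# append_serial: hypothetical helper name, not from the original answer.
# Reads PDB-like text on stdin and appends the serial number (field 2)
# to the atom name (field 3) on HETATM lines, leaving all spacing intact.
append_serial() {
    awk '
    $1 == "HETATM" && match($0, /^HETATM +[^ ]+ +[^ ]+/) {
        # RLENGTH is the length of "HETATM<spaces>serial<spaces>name".
        # Splice $2 in right after the name instead of assigning to $3,
        # because assigning to a field makes awk rebuild $0 with OFS.
        $0 = substr($0, 1, RLENGTH) $2 substr($0, RLENGTH + 1)
    }
    { print }
    '
}

# Demo on two lines taken from the input in the question.
printf '%s\n' \
    'HETATM   25  O   UNL     1      86.047  83.059 103.165  1.00  0.00           O' \
    'CONECT    1    2' | append_serial
```

On a real file this would be called as `append_serial < file.pdb`. Lines whose first field is not HETATM pass through unchanged.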

Original answer:

Assuming your input really is tab-separated as you indicate in your script, you were very, very close:

$ awk 'BEGIN{FS=OFS="\t"} $1=="HETATM"{$3=$3 $2} 1' file
HETATM  25      O25     UNL     1       86.047  83.059  103.165 1.00    0.00    O
HETATM  26      N26     UNL     1       87.071  82.457  102.433 1.00    0.00    N
HETATM  27      C27     UNL     1       91.764  77.729  97.523  1.00    0.00    C
HETATM  28      O28     UNL     1       92.740  78.174  98.137  1.00    0.00    O
HETATM  29      H29     UNL     1       90.477  80.552  97.677  1.00    0.00    H
CONECT  1       2
CONECT  2       1       3
CONECT  3       2       4       7
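
Whether the tab-separated assumption holds can be checked before choosing between the two commands. The sketch below is just one way to do it; the function name has_tabs is made up for this illustration. If the file contains no tabs, then with FS="\t" the whole line ends up in $1, so the test $1=="HETATM" never succeeds and the file is printed unchanged.

```shell
#!/bin/sh
# has_tabs: hypothetical helper name used only for this illustration.
# Succeeds (exit status 0) if stdin contains at least one tab character.
has_tabs() {
    grep -q "$(printf '\t')"
}

# Demo: the first sample is space-separated, the second is tab-separated.
printf 'HETATM   25  O   UNL\n' | has_tabs && echo 'tabs' || echo 'no tabs'
printf 'HETATM\t25\tO\tUNL\n'   | has_tabs && echo 'tabs' || echo 'no tabs'
```

On a real file this would be called as `has_tabs < file.pdb`.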