Add file name as a new column with awk


First of all, the existing questions didn't solve my problem, which is why I am asking again.

I have two text files, temp.txt:

adam    12
george  15
thomas  20

and demo.txt:

mark    8
richard 11
james   18

I want to combine them and add a third column containing each file's name without the extension, like this:

adam    12   temp
george  15   temp
thomas  20   temp
mark    8    demo
richard 11   demo
james   18   demo

I used this script:

for i in $(ls); do name=$(basename -s .txt $i)| awk '{OFS="\t";print $0, $name} ' $i; done

But it yields the following table:

mark    8   mark    8
richard 11  richard 11
james   18  james   18
adam    12  adam    12
george  15  george  15
thomas  20  thomas  20

I don't understand why it prints the whole line again instead of the value of the name variable.

Thanks in advance.


There are 4 best solutions below

2
On BEST ANSWER

First, you need to unmask $name: it sits inside the single quotes, so the shell never replaces it with the file name. Once the shell can expand it, you also need double quotes around it so that awk sees the value as a string literal. Note also that the pipe after the assignment has been replaced with a semicolon:

for i in $(ls); do name=$(basename -s .txt $i); awk '{OFS="\t";print $0, "'$name'"} ' $i; done
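
To see what the quote splicing above actually hands to awk, here is a minimal sketch. The variable prog and the sample input line are hypothetical, introduced only for illustration; they are not part of the original answer.

```shell
name=temp

# The single-quoted string is closed just before $name and reopened after it,
# so the shell expands $name and glues the three pieces together.
# The double quotes are literal characters that end up inside the awk program.
prog='{OFS="\t"; print $0, "'$name'"}'

# Show the program text that awk receives.
printf '%s\n' "$prog"

# Run it on one sample line.
result=$(printf 'adam 12\n' | awk "$prog")
printf '%s\n' "$result"
```

Because $name is unquoted on the awk command line in the answer, a file name containing spaces or double quotes would likely break it, which is one reason passing the value with -v is often preferred.
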
0
On

Awk has no access to Bash's variables, or vice versa. Inside the Awk script, name is undefined, so $name gets interpreted as $0.
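
The field-reference behaviour described above can be checked with a short sketch. The sample input line and the variable names unset_case, string_case and plain_case are assumptions made for illustration.

```shell
# name is never assigned inside awk, so it is empty; as a field number it is 0,
# which makes $name the same as $0 (the whole line).
unset_case=$(printf 'adam 12\n' | awk '{ print $name }')

# Even when name holds a string such as "temp", the string converts to the
# number 0 when used as a field index, so $name is still $0.
string_case=$(printf 'adam 12\n' | awk -v name=temp '{ print $name }')

# Without the dollar sign, awk prints the value of the variable itself.
plain_case=$(printf 'adam 12\n' | awk -v name=temp '{ print name }')

printf '%s\n' "$unset_case" "$string_case" "$plain_case"
```

The first two commands print the input line unchanged, and only the third prints temp.
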

Also, don't use ls in scripts, and quote your shell variables.

Finally, the assignment of name does not print anything, so piping its output to Awk makes no sense.

for i in ./*; do
    name=$(basename -s .txt "$i")
    # name is an awk variable here; refer to it without the dollar sign
    awk -v name="$name" '{OFS="\t"; print $0, name}' "$i"
done

The basename calculation could just as easily be performed natively in Awk, but I leave that as an exercise. (Hint: copy FILENAME into a variable and apply sub(regex, "", variable) to it.)

0
On

awk has a FILENAME variable whose value is the path of the file being processed, and an FNR variable whose value is the current line number within that file. So at FNR == 1 you can process FILENAME and store the result in a variable to use for the rest of the file:

awk -v OFS='\t' '
    FNR == 1 {
        basename = FILENAME
        sub(".*/", "", basename)      # strip from the start up to the last "/"
        sub(/\.[^.]*$/, "", basename) # strip from the last "." up to the end
    }
    { print $0, basename }
' ./path/temp.txt ./path/demo.txt

Output:

adam    12   temp
george  15   temp
thomas  20   temp
mark    8    demo
richard 11   demo
james   18   demo
0
On

Using BASH:

for i in temp.txt demo.txt ; do  while read -r a b ; do printf "%s\t%s\t%s\n" "$a" "$b" "${i%%.*}" ; done <"$i" ; done

Output:

adam    12  temp
george  15  temp
thomas  20  temp
mark    8   demo
richard 11  demo
james   18  demo

For each source file, this reads every line and uses printf to output tab-delimited columns. The third column is the current file's name with its extension removed by bash parameter expansion.
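
The parameter expansion used above can be tried on its own. This is a minimal sketch; the second file name, archive.tar.gz, is a hypothetical example chosen to show how the two forms differ when a name contains more than one dot.

```shell
i=temp.txt
# %% removes the longest suffix matching the pattern ".*",
# that is, everything from the first dot onward.
short="${i%%.*}"

j=archive.tar.gz
# Longest match: everything from the first dot is removed.
first="${j%%.*}"
# A single % removes the shortest match: only the last extension.
last="${j%.*}"

printf '%s\n' "$short" "$first" "$last"
```

For names like temp.txt and demo.txt both forms give the same result; they differ only when the name has several dots.
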