Merge three columns in one (linux, python, or perl)


I have a single .tsv file that contains the variant calls for all samples. I would like to merge the first three columns into one column:

Example input (file name: variants.tsv). The first three columns, which I want to merge, are:

lane sampleID Barcode

B31 00-00-NNA-0000 0000

Desired output:

ID

B31_00-00-NNA-0000_0000

What are the recommended methods?


There are 2 best solutions below


One way, with a perl one-liner:

perl -F'\t' -lane '
    if ($. == 1) {
        print join("\t", "ID", @F[3..$#F])
    } else {
        print join("\t", join("_", @F[0,1,2]), @F[3..$#F])
    }' variants.tsv

This splits each line on tabs into an array (@F). For the header line it prints ID followed by the remaining column names; for every other line it joins the first three fields with underscores and prints the result followed by the remaining fields, all tab-separated.
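Since the question also mentions Python, here is a rough equivalent of the same split, slice, and join idea. It is only a sketch: the function name merge_first_three and the fourth column (extra / x) in the demo are made up for illustration, and it assumes a plain tab-separated file with a header row and no quoted fields.

```python
def merge_first_three(lines, sep="\t", joiner="_", new_name="ID"):
    """Yield each line with its first three fields merged into one.

    Hypothetical helper for illustration; assumes every line has at
    least three tab-separated fields and the first line is the header.
    """
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(sep)
        # Header row: the three column names collapse into the new name.
        # Data rows: the three values are joined with the joiner.
        merged = new_name if lineno == 1 else joiner.join(fields[:3])
        yield sep.join([merged] + fields[3:])


if __name__ == "__main__":
    # In-memory demo; the fourth column is invented sample data.
    sample = [
        "lane\tsampleID\tBarcode\textra\n",
        "B31\t00-00-NNA-0000\t0000\tx\n",
    ]
    for row in merge_first_three(sample):
        print(row)
```

For a real file you would pass an open file handle instead of the list, for example open("variants.tsv"), and write the yielded lines to a new file.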


Starting from this

lane  sampleID        Barcode
B31   00-00-NNA-0000  0000

and using Miller, you can run

mlr --tsv put -S '$ID=$lane."_".$sampleID."_".$Barcode' input.tsv >output.tsv

to get (shown here as a formatted table; the output file itself is tab-separated)

+------+----------------+---------+-------------------------+
| lane | sampleID       | Barcode | ID                      |
+------+----------------+---------+-------------------------+
| B31  | 00-00-NNA-0000 | 0000    | B31_00-00-NNA-0000_0000 |
+------+----------------+---------+-------------------------+

If you want only the ID field, the command is

mlr --tsv put -S '$ID=$lane."_".$sampleID."_".$Barcode' then cut -f ID input.tsv >output.tsv
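If Miller is not available, the same name-based approach (referring to lane, sampleID and Barcode by header name rather than by position) can be sketched with Python's standard csv module. This is an illustration only: add_id_column and its only_id parameter are hypothetical names, and it assumes the input is non-empty, has a header row containing those three column names, and uses no special quoting.

```python
import csv
import io


def add_id_column(src, dst, only_id=False):
    """Append an ID column built from lane, sampleID and Barcode.

    Hypothetical helper; src and dst are text file objects.
    With only_id=True, only the ID column is written.
    """
    reader = csv.DictReader(src, delimiter="\t")
    out_fields = ["ID"] if only_id else list(reader.fieldnames) + ["ID"]
    writer = csv.DictWriter(
        dst,
        fieldnames=out_fields,
        delimiter="\t",
        extrasaction="ignore",  # lets only_id drop the other columns
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # csv reads every value as a string, so 0000 keeps its zeros.
        row["ID"] = "_".join([row["lane"], row["sampleID"], row["Barcode"]])
        writer.writerow(row)


if __name__ == "__main__":
    # In-memory demo using the sample row from the question.
    src = io.StringIO("lane\tsampleID\tBarcode\nB31\t00-00-NNA-0000\t0000\n")
    dst = io.StringIO()
    add_id_column(src, dst)
    print(dst.getvalue(), end="")
```

With real files, open the input and output with newline="" as the csv documentation recommends, and pass the two handles in place of the StringIO objects.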