Filter overlapping entries in bed file (2)


This is a follow-up to my previous question. This time, when two entries in a BED file overlap, I would like to keep one of them (randomly selected) in the output. Preferably, I am looking for a modification of the previous answer (bedtools package). For the input:

1   183113  183114  chr1:183113-183240  0   +
1   187286  187287  chr1:187128-187287  0   -
1   187576  187587  chr1:187375-187577  0   -
1   187580  187590  chr1:187379-187577  0   -

I expect to get output like:

1   183113  183114  chr1:183113-183240  0   +
1   187286  187287  chr1:187128-187287  0   -
1   187576  187587  chr1:187375-187577  0   -

There is 1 solution below

Are you able to split your coordinates into two separate BED files, fileA.bed and fileB.bed, instead? If so, you could use bedtools intersect with the -u option, which reports each entry in fileA.bed once if it overlaps at least one entry in fileB.bed (https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html):

bedtools intersect -u -a fileA.bed -b fileB.bed > overlapped_entries.bed
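
If splitting into two files is not practical, the original goal (keep one randomly chosen entry from each group of overlapping entries in a single BED file) could also be sketched with plain sort and awk rather than bedtools. This is only a minimal sketch under some assumptions: the function name pick_one_per_overlap is hypothetical, the input is a whitespace-separated BED file with half-open coordinates (so entries that merely touch are not treated as overlapping), and chained overlaps (A overlaps B, B overlaps C) are treated as one group from which a single entry is kept.

```shell
# Hypothetical helper, not part of bedtools: keep one randomly chosen entry
# from each group of overlapping intervals in a BED file.
# Usage: pick_one_per_overlap FILE [SEED]   (SEED defaults to the current time)
pick_one_per_overlap() {
  sort -k1,1 -k2,2n "$1" |
  awk -v seed="${2:-$(date +%s)}" '
    BEGIN { srand(seed) }
    NF < 3 { next }                          # skip blank or malformed lines
    # Same chromosome and start before the running group end: same group.
    started && ($1 "") == chrom && $2 + 0 < group_end {
      n++
      if (rand() < 1 / n) keep = $0          # reservoir sampling over the group
      if ($3 + 0 > group_end) group_end = $3 + 0
      next
    }
    # Otherwise print the entry kept for the finished group and start a new one.
    {
      if (started) print keep
      started = 1; chrom = $1 ""; group_end = $3 + 0; n = 1; keep = $0
    }
    END { if (started) print keep }
  '
}
```

With a hypothetical input.bed holding the four entries from the question, pick_one_per_overlap input.bed > filtered.bed should write three entries: the first two unchanged, plus one of the two overlapping entries. Passing a fixed seed as the second argument makes the choice repeatable, although which entry a given seed selects may differ between awk implementations.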