How to get the overlapping region from findOverlaps?


Hi, I am working with GRanges and finding overlaps using the findOverlaps function from IRanges. I get the hits showing which query and subject ranges overlap, but I also want the coordinates of the query and subject where they overlap, so that I can retrieve the corresponding sequence.

How can I get the coordinates of both subject and query where they overlap? I am using the following code:

library(GenomicRanges)
library(regioneR) # toGRanges

# Query intervals (the data frames must exist before findOverlaps is called)
df1 <- structure(list(df1c = c("chr2", "chr2", "chr2", "chr2"), df1c2 = c(2800, 
3600, 3719, 3893), df1c3 = c(3270, 4152, 5092, 4547)), class = "data.frame", row.names = c(NA, 
-4L))

# Subject intervals
df2 <- structure(list(df2c = c("chr2", "chr2", "chr2", "chr2", "chr2L"
), df2c2 = c(263, 342, 424, 846, 1030), df2c3 = c(20091, 17222, 
2612, 4265, 11575)), class = "data.frame", row.names = c(NA, 
-5L))

# Hits where a df1 range lies within a df2 range
fo <- findOverlaps(query = toGRanges(df1), subject = toGRanges(df2), type = "within")


The expected output should look like this:

chr  CoDF1     CoDF2
1    100-200   90-210
1    150-280   100-285

CoDF1 = coordinates in the df1 file where it overlaps df2 reads
CoDF2 = coordinates in the df2 file where it overlaps df1 reads

There are 1 best solutions below

Basti On BEST ANSWER

You would be better off using intersect():

> intersect(toGRanges(df1),toGRanges(df2))

GRanges object with 2 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr2 2800-3270      *
  [2]     chr2 3600-5092      *
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

Note, however, that the column names of your data frames are not suitable for creating a GRanges object; they should be seqnames/start/end.
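As a minimal sketch of the renaming step, the columns can be renamed and then converted. This example assumes makeGRangesFromDataFrame() from GenomicRanges as the converter (a substitution for toGRanges(), chosen because it looks the columns up by name); the object name gr1 is only illustrative.

```r
library(GenomicRanges)

# Same values as df1 in the question, with the original column names
df1 <- data.frame(df1c  = c("chr2", "chr2", "chr2", "chr2"),
                  df1c2 = c(2800, 3600, 3719, 3893),
                  df1c3 = c(3270, 4152, 5092, 4547))

# Rename to the names GRanges-related helpers look for
names(df1) <- c("seqnames", "start", "end")

# Assumption: makeGRangesFromDataFrame() is used here instead of toGRanges()
gr1 <- makeGRangesFromDataFrame(df1)
gr1
```

The same renaming applies to df2 before converting it.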

EDITED:

To see all intersections of all coordinates:

intersection = findOverlaps(query = toGRanges(df1), subject = toGRanges(df2), type = "any")
df = data.frame(df1[queryHits(intersection),], df2[subjectHits(intersection),])
df
    seqnames start  end seqnames.1 start.1 end.1
1       chr2  2800 3270       chr2     263 20091
1.1     chr2  2800 3270       chr2     342 17222
1.2     chr2  2800 3270       chr2     846  4265
2       chr2  3600 4152       chr2     263 20091
2.1     chr2  3600 4152       chr2     342 17222
2.2     chr2  3600 4152       chr2     846  4265
3       chr2  3719 5092       chr2     263 20091
3.1     chr2  3719 5092       chr2     342 17222
3.2     chr2  3719 5092       chr2     846  4265
4       chr2  3893 4547       chr2     263 20091
4.1     chr2  3893 4547       chr2     342 17222
4.2     chr2  3893 4547       chr2     846  4265
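If the coordinates of the overlapping stretch itself are needed for each query/subject pair (for example, to fetch its sequence afterwards), one possible approach is pintersect() from GenomicRanges, applied to the ranges picked out by the hits. The sketch below is an assumption about what the question is after, not part of the original answer; the objects gr1, gr2, q, s, ov and res are illustrative names, and the GRanges objects are built directly with GRanges() rather than toGRanges().

```r
library(GenomicRanges)

# Same intervals as df1 and df2 in the question
gr1 <- GRanges("chr2",
               IRanges(c(2800, 3600, 3719, 3893),
                       c(3270, 4152, 5092, 4547)))
gr2 <- GRanges(c("chr2", "chr2", "chr2", "chr2", "chr2L"),
               IRanges(c(263, 342, 424, 846, 1030),
                       c(20091, 17222, 2612, 4265, 11575)))

# One hit per overlapping (query, subject) pair
hits <- findOverlaps(gr1, gr2, type = "any")

# Expand both sides so that element i of q overlaps element i of s
q <- gr1[queryHits(hits)]
s <- gr2[subjectHits(hits)]

# Parallel intersection: the shared stretch of each pair
ov <- pintersect(q, s)

# One row per pair: both original ranges plus the overlapping region
res <- data.frame(chr     = as.character(seqnames(ov)),
                  CoDF1   = paste(start(q), end(q), sep = "-"),
                  CoDF2   = paste(start(s), end(s), sep = "-"),
                  overlap = paste(start(ov), end(ov), sep = "-"))
res
```

Unlike intersect(), which first merges the ranges on each side, this keeps one row per overlapping pair, so the overlap column can be traced back to a specific df1 and df2 row.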