Merge dataframes based on overlapping genomic ranges
I have two files:
anno
  chromosome position functionGVS
1      chr22 16050036  intergenic
2      chr22 16050039  intergenic
3      chr22 16050094  intergenic
4      chr22 16050097  intergenic
5      chr22 16050109  intergenic
6      chr22 16050115  intergenic
huvec
    chr    start      end function
1 chr22 16050000 16051244        R
2 chr22 16051244 16051521        T
3 chr22 16051521 16060433        R
4 chr22 16060433 16060582        T
5 chr22 16060582 16080564        R
6 chr22 16080564 16082420        T
I am trying to find, for each anno$position, the huvec interval
(huvec$start to huvec$end) that contains it. Here is my code:
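library(GenomicRanges)
# Column names follow the huvec table shown above (chr, start, end)
gr.huvec <- with(huvec, GRanges(seqnames=chr, ranges=IRanges(start=start, end=end)))
gr.anno <- GRanges(seqnames=anno$chromosome,
                   ranges=IRanges(start=anno$position, width=1))
# query = huvec intervals, subject = anno positions
hits <- findOverlaps(gr.huvec, gr.anno)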
My question: now that I have the query hits and subject hits, how can I
assign huvec$function to anno based on the overlapping regions? In my
case, every position in anno$position falls within the first huvec interval
(16050000 to 16051244), so I want to add the associated huvec$function
value, 'R', as a new column in anno. Any suggestions?
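One possible way to carry the hits back to anno is sketched below. It is only a minimal illustration under a few assumptions: the two data frames are rebuilt by hand from the rows shown above, the new column name huvec_function is made up for the example, and the function column is accessed with [["function"]] because function is a reserved word in R. The query and subject order matches the findOverlaps call in the question (huvec as query, anno as subject).

```r
library(GenomicRanges)

# Sample data copied from the tables in the question
anno <- data.frame(
  chromosome  = "chr22",
  position    = c(16050036L, 16050039L, 16050094L, 16050097L, 16050109L, 16050115L),
  functionGVS = "intergenic",
  stringsAsFactors = FALSE
)
huvec <- data.frame(
  chr   = "chr22",
  start = c(16050000L, 16051244L, 16051521L, 16060433L, 16060582L, 16080564L),
  end   = c(16051244L, 16051521L, 16060433L, 16060582L, 16080564L, 16082420L),
  state = c("R", "T", "R", "T", "R", "T"),
  stringsAsFactors = FALSE
)
names(huvec)[4] <- "function"  # match the column name used in the question

gr.huvec <- GRanges(seqnames = huvec$chr,
                    ranges = IRanges(start = huvec$start, end = huvec$end))
gr.anno <- GRanges(seqnames = anno$chromosome,
                   ranges = IRanges(start = anno$position, width = 1))

# query = huvec intervals, subject = anno positions
hits <- findOverlaps(gr.huvec, gr.anno)

# Start with NA so positions that overlap no interval stay marked as missing.
# huvec_function is a hypothetical column name chosen for this sketch.
anno$huvec_function <- NA_character_

# subjectHits() indexes rows of anno, queryHits() indexes rows of huvec
anno$huvec_function[subjectHits(hits)] <- huvec[["function"]][queryHits(hits)]

print(anno)
```

Note that adjacent huvec intervals share a boundary (for example 16051244 is both an end and a start), so a position sitting exactly on a boundary would hit two intervals, and with the assignment above the later hit would overwrite the earlier one. None of the six sample positions is in that situation. If one label per position is wanted, calling findOverlaps(gr.anno, gr.huvec, select = "first") returns a single huvec row index (or NA) per anno position, which can be used to index huvec[["function"]] directly.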