Question

intersect ranges from a GAlignment object with a GRangesList

ccanchaya • 0

Hi,

I would like to find the regions of my genome that are overlapped by alignments, in order to calculate the percentage of CDS that is covered by my GAlignments object.

reduced_alignment <- reduce(intersect(as(x,"GRanges"),unlist(CDs_list_by_transcript)), ignore.strand=TRUE )

Unfortunately, I don't want to unlist the CDS by transcript, because I would like to compute my "breadth" percentages on a per-transcript basis. What I am doing now is using an lapply to intersect each transcript with the whole "reduced_alignment" object via a second "intersect", but it takes too much time and processing power.

Any idea how to simplify this last step?

Cheers,

Carlos

intersect grangeslist galignment • 1.2k views

updated 8.5 years ago by Hervé Pagès 16k • written 8.5 years ago by ccanchaya • 0

score 0 · Answer 1 · 2017-05-19

Hi Carlos,

I would suggest you check coverageByTranscript() in the GenomicFeatures package. It computes the per-transcript (or per-CDS) coverage of your set of reads. Once you have this for your set of CDS (say in cds_cvg, which will be an RleList object with one integer-Rle per CDS), you can easily get the percentage of bases that are covered in each CDS with something like 100 * sum(cds_cvg != 0) / lengths(cds_cvg).
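To make that concrete, here is a minimal sketch on toy data. The names cds_by_tx and pct_covered, the two transcripts tx1 and tx2, and the two alignments are all made up for illustration. With a real annotation, the GRangesList would presumably come from something like cdsBy(txdb, by = "tx"), and x would be your own GAlignments object. I set ignore.strand = TRUE only because your original call did; drop it if you want strand-aware coverage.

```r
library(GenomicRanges)      # GRanges, GRangesList
library(GenomicAlignments)  # GAlignments
library(GenomicFeatures)    # coverageByTranscript

# Hypothetical CDS parts grouped by transcript (assumption: stands in for
# something like cdsBy(txdb, by = "tx") on a real TxDb object).
cds_by_tx <- GRangesList(
  tx1 = GRanges("chr1", IRanges(start = c(1, 21), end = c(10, 30)), strand = "+"),
  tx2 = GRanges("chr1", IRanges(start = 101, end = 120), strand = "+")
)

# Hypothetical alignments standing in for the GAlignments object 'x'.
x <- GAlignments(
  seqnames = Rle(factor(c("chr1", "chr1"))),
  pos      = c(6L, 111L),
  cigar    = c("10M", "10M"),
  strand   = Rle(strand(c("+", "+")))
)

# One integer-Rle of per-base read depth per transcript, in transcript
# coordinates (CDS parts concatenated in the order they are listed).
cds_cvg <- coverageByTranscript(x, cds_by_tx, ignore.strand = TRUE)

# Breadth of coverage: percentage of CDS bases with depth > 0, per transcript.
pct_covered <- 100 * sum(cds_cvg != 0) / lengths(cds_cvg)
pct_covered
```

This works on the whole GRangesList at once, so there is no need to unlist the CDS or to loop over transcripts with lapply. Note that coverageByTranscript() expects the ranges within each transcript to be ordered by rank, which is how cdsBy() and exonsBy() return them.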

Hope this helps,

H.