Entering edit mode
dear all!
I wrote a function to stack IRanges
or GRanges
but it is helplessly slow when it comes to "real life" data (i.e. reads from RNA-seq). I know I could use the coverage
function, but I did really want to get the individual ranges stacked in different lines of non-overlapping ranges. The code and an example are shown below.
I would be grateful if someone knows a more efficient way to do this, but I'm afraid I'll have to use C code in the end...
thanks in advance!
require( IRanges ) stackRanges <- function( x ){ x <- x[ order( start( x ) ) ] Overlaps <- as.matrix( findOverlaps( x, x ) ) Index <- list() Which.ranges <- 1:length( x ) ## repeatedly go through the ranges and determine those not directly ## overlapping, these are put into the "first line" and the ##evaluation is repeated for all remaining ranges repeat{ Which.ranges <- Which.ranges[ !( Which.ranges %in% unlist( Index ) ) ] if( length( Which.ranges )==0 ){ ## nothing left... break } CurrentLine <- c() for( i in Which.ranges ){ ## check if any subject for query i is smaller than i ## if yes: don't add, otherwise, add if( !( any( Overlaps[ Overlaps[ , 1 ]==i, 2 ] %in% CurrentLine ) ) ) CurrentLine <- c( CurrentLine, i ) } Index <- c( Index, list( CurrentLine ) ) } res <- lapply( Index, function( z ){ return( x[ z ] ) } ) return( res ) } plotRangesList <- function( x, height=1, ybottom=0 ){ X.l <- min( unlist( lapply( lapply( x, range ), start ) ) ) X.u <- max( unlist( lapply( lapply( x, range ), end ) ) )+1 plot( 3, 3, pch=NA, xlim=c( X.l, X.u ), ylim=c( ybottom, ( ybottom+length( x )*height) ) ) for( i in 1:length( x ) ){ ybot <- ybottom + ( i-1 ) * height rect( xleft=start( x[[ i ]] ), xright=end( x[[ i ]] )+1, ybottom=rep( ybot, length( x[[ i ]] ) ), ytop=rep( (ybot+height), length( x[[ i ]] ) ), col="grey" ) } } IR2 <- IRanges( start=c( 8, 4, 9, 11, 16, 18, 19, 25 ), end=c( 10, 12, 10, 23, 18, 24, 23, 27 ) ) plotRangesList( stackRanges( IR2 ) )
That's perfect!
In the meantime I've implemented it in C, but your call is faster! thanks!
Could just use `split` instead of `splitAsList`. Also, the OP might benefit from using an existing plotting package like ggbio or Gviz.