As a third part of my subsetByOverlaps with Multimapping, I would to further manipulate my GRangesList object. I have created a GRangesList objaect after subsetting my intensities to only the regions of the genes in the gtf file.
This gave me a GRangesList object looks like that:
GRangesList object of length 7126: $15S_rRNA GRanges object with 1649 ranges and 1 metadata column: seqnames ranges strand | score <Rle> <IRanges> <Rle> | <numeric> [1] MT [6546, 6546] * | 48.0121515363538 [2] MT [6547, 6547] * | 48.0456043815372 [3] MT [6548, 6548] * | 48.0865514607159 [4] MT [6549, 6549] * | 48.1089269541849 [5] MT [6550, 6550] * | 48.1350452590915 ... ... ... ... . ... [1645] MT [8190, 8190] * | 145.573432997664 [1646] MT [8191, 8191] * | 145.573432997664 [1647] MT [8192, 8192] * | 145.573432997664 [1648] MT [8193, 8193] * | 145.573432997664 [1649] MT [8194, 8194] * | 118.114351592435 ... <7125 more elements> ------- seqinfo: 17 sequences from an unspecified genome
The first 10 elements of the list can be found here (it is too big to be added here).
I would like to know, if there a way to summarize the scores into bins.The GRanges in the list are each of different sizes.
>unlist(lapply(gr.list[1:10], length)) 15S_rRNA 21S_rRNA HRA1 ICR1 LSR1 NME1 PWR1 Q0010 Q0017 Q0032 1649 4439 564 3199 1175 340 941 387 162 291
I would like to know if it's possible to summarize all elements in each object of the list separately into e.g. 100 bins with the average score value in the metadata column.
The number of bins should be fixed and identical to all objects in the GRangesList, independent of how many ranges were in the original GRanges. In other word I would like to create for each gene (=GRanges object) in the list a new GRanges object of 100 ranges. The score of each of the ranges would be calculated from the summarized scores in the original GRanges object.
I would appreciate any help in doing so.
Thanks in advance Assa
+1 more. This is amazing, how you can fit anything in R. Thanks a lot for the explanation.
Thanks for the detailed solution, Michael.
I have similar data as Assa, except that my "scores" are "frequency".
I get an error when trying to attach the scores back to each tile, because my agg have lesser rows than tiles.gr:
Some of the tiles have zero frequency as my SNPs are scattered across the chromosome:
Is there a way to include the zero scores? Or maybe drop the zero tiles?
Thanks
Jenny
Maybe something like
Yup that works perfectly, thanks!