Question: loop over IRanges spaces
Yvan wrote:
Hello,

I tried your code below, but I am not able to create the scoreRleList, as weight can only be an integer list.

> read_gff
function(file="gff3.txt") {
    gff <- read.delim(file,header=FALSE)
    colnames(gff) <- c("seqname", "source", "feature", "start", "end",
                       "score", "strand","frame", "comments")
    return(gff)
}
> library(IRanges)

Attaching package: 'IRanges'

The following object(s) are masked from 'package:base':

    cbind, Map, mapply, order, paste, pmax, pmax.int, pmin, pmin.int,
    rbind, rep.int, table

> gff<-read_gff("/home/yvan/Lundalm/R/projects/coen/gff_original/50Lines.gff")
> rd<-RangedData(IRanges(gff$start,width=1),score=gff$score,start=gff$start,space=gff$seqname)
> scoreRleList<-coverage(rd, weight ="score",width = as.list(table(rd$space)))
Error in .local(x, shift, width, weight, ...) :
  'weight' must be a non-empty list of integers
>

Can I change/force the type of 'weight'?

In a different approach, I tried to use different views on the data set. This was motivated by the fact that the difference between the start positions of consecutive probes could indicate either a change of scaffold or a gap inside a scaffold.

> first_view<-slice(diff(rd$start), lower=10, upper=200)
> first_view
Views on a 60-integer XInteger subject
subject: 28 24 25 37 23 24 32 34 25 23 63 ... 24 35 24 29 31 28 19 29 27 29 37
views:
    start end width
[1]     1  19    19 [28 24 25 37 23 24 32 34 25 23 63 21 36 27 19 33 33 24 32]
[2]    21  39    19 [ 31 22 34 18 36 25 ... 31 108 34 28 26 23 33]
[3]    41  60    20 [32 28 26 20 33 25 30 30 24 ... 29 31 28 19 29 27 29 37 22]
>

So here the three views are the three scaffolds. But if I want to change the subject of the view to the score:

> second_view<-Views(subject=rd$score,start=first_view$start,end=first_view$end)
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "Views", for signature "numeric"
>

Again, can I change the expected type of the signature in the Views function?

Thanks again for your help.

On 04/26/2010 09:08 PM, Patrick Aboyoun wrote:
> Yvan,
> It appears to me that you are trying to perform two conflicting
> activities:
>
> 1) Calculate the running sum of a metric over an annotated sequence
> (as evidenced by your aggregate function call)
> 2) Find the sum for a metric across specified intervals on the
> annotated sequence (as evidenced by your desire to assign the
> aggregated sums into an existing RangedData object)
>
> Taking a step back, I am guessing that you are trying to transform
> something akin to a UCSC bed file into something else that is UCSC bed
> file like. If you are using rtracklayer, this means your initial data
> are stored in a RangedData object. To create a RangedData object
> containing the running sum of a values column from an initial
> RangedData object, I recommend:
>
> 1) Creating an RleList object from the RangedData object using the
> coverage function. Make sure to specify the metric of interest in the
> weight argument to coverage.
> 2) Using the runsum function on the RleList object to calculate your
> running sums.
> 3) Creating a RangedData object from the RleList object in step 2
> using as(<<obj>>, "RangedData")
>
> Here is an example:
>
> > # Step 1: create an RleList representation of the metric
> > rd <- RangedData(IRanges(start = c(5, 10, 15, 2, 4, 8),
>                            end = c(7, 14, 21, 3, 6, 9)),
>                    score = 1:6, space = rep(c("A", "B"), each = 3))
> > scoreRleList <- coverage(rd, weight = "score",
>                            width = list(A = 30, B = 10))
> > scoreRleList
> SimpleRleList of length 2
> $A
> 'integer' Rle of length 30 with 6 runs
>   Lengths: 4 3 2 5 7 9
>   Values : 0 1 0 2 3 0
>
> $B
> 'integer' Rle of length 10 with 6 runs
>   Lengths: 1 2 3 1 2 1
>   Values : 0 4 5 0 6 0
>
> > # Step 2: calculate the running sums
> > scoreRunsum <- runsum(scoreRleList, k = 3, endrule = "constant")
> > scoreRunsum
> SimpleRleList of length 2
> $A
> 'integer' Rle of length 30 with 15 runs
>   Lengths: 3 1 1 1 1 1 1 1 3 1 1 5 1 1 8
>   Values : 0 1 2 3 2 1 2 4 6 7 8 9 6 3 0
>
> $B
> 'integer' Rle of length 10 with 7 runs
>   Lengths: 2 1 1 1 1 1 3
>   Values : 8 13 14 15 10 11 12
>
> > # Step 3: Create a RangedData representation of the running sums
> > rdRunsum <- as(scoreRunsum, "RangedData")
> > rdRunsum
> RangedData with 22 rows and 1 value column across 2 spaces
>          space    ranges |     score
>    <character> <IRanges> | <integer>
> 1            A  [ 1,  3] |         0
> 2            A  [ 4,  4] |         1
> 3            A  [ 5,  5] |         2
> 4            A  [ 6,  6] |         3
> 5            A  [ 7,  7] |         2
> 6            A  [ 8,  8] |         1
> 7            A  [ 9,  9] |         2
> 8            A  [10, 10] |         4
> 9            A  [11, 13] |         6
> ...        ...       ... ...       ...
> 14           A  [22, 22] |         3
> 15           A  [23, 30] |         0
> 16           B  [ 1,  2] |         8
> 17           B  [ 3,  3] |        13
> 18           B  [ 4,  4] |        14
> 19           B  [ 5,  5] |        15
> 20           B  [ 6,  6] |        10
> 21           B  [ 7,  7] |        11
> 22           B  [ 8, 10] |        12
>
> > sessionInfo()
> R version 2.11.0 Patched (2010-04-24 r51820)
> i386-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] IRanges_1.6.1
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.0
>
> On 4/23/10 6:26 AM, Michael Lawrence wrote:
>> Also note that it's not really necessary to loop here, as is often
>> the case with IRanges:
>>
>> rd$windo <- unlist(runmean(RleList(values(rd)[,"score"])))
>>
>> On Fri, Apr 23, 2010 at 6:13 AM, Michael Lawrence <michafla at gene.com>
>> wrote:
>>
>>> On Fri, Apr 23, 2010 at 1:49 AM, Yvan <yvan.strahm at uni.no> wrote:
>>>
>>>> Hello,
>>>> Thank you both of you.
>>>>
>>>> I could calculate the sliding window, but not as an Rle object; I
>>>> could not append values for the last w-1 positions in the Rle
>>>> object in order to take care of the size problem.
>>>>
>>> Why not? Rle supports all the normal vector operations. And runsum or
>>> runmean will output a vector of the same size as the input, using a
>>> choice of two endrules. If you want 0's at the end, try something like:
>>>
>>> rle[(nrow(rd)-w+1):nrow(rd)] <- 0
>>>
>>>> So I did it like that:
>>>>
>>>> params<-RDApplyParams(rd,function(rd)
>>>> append((diff(c(0,cumsum(rd$score)),lag=w)/w),rep(0,each=w-1),after=(length(rd$score)-w+1)))
>>>>
>>>> But when I try to add the new values to the rangedData object I get
>>>> this error.
>>>>
>>>> > values(rd)[,"windo"]<-rdapply(params)
>>>> Error in [<-(*tmp*, , j, value = <S4 object of class "DataFrame">) :
>>>>   ncol(x[j]) != ncol(value)
>>>> In addition: Warning messages:
>>>> 1: In mapply(f, ..., SIMPLIFY = FALSE) :
>>>>   longer argument not a multiple of length of shorter
>>>> 2: In mapply(f, ..., SIMPLIFY = FALSE) :
>>>>   longer argument not a multiple of length of shorter
>>>>
>>>> But when I check the size, they are the same, here for one space
>>>>
>>>> > x<-rdapply(params)
>>>> > length(x$SCAFFOLD_100) == length(rd["SCAFFOLD_100"]$windo)
>>>> [1] TRUE
>>>>
>>> It may be that that type of insertion is unsupported. Why not just do
>>> something like:
>>>
>>> rd$window <- unlist(x)
>>>
>>>> Maybe params is missing a parameter, or the way I try to update the
>>>> rd object is wrong. Anyway, from the rdapply output a vector could
>>>> be created, and so a new rd object with the new value column.
>>>>
>>>> yvan
>>>>
>>>> On 22/04/10 15:51, Michael Lawrence wrote:
>>>>
>>>> On Thu, Apr 22, 2010 at 5:49 AM, Michael Dondrup
>>>> <michael.dondrup at uni.no> wrote:
>>>>
>>>>> Hi,
>>>>> how about function rdapply (not lapply) which is for that?
>>>>>
>>>> lapply() should apply per-space as well, basically providing a
>>>> short-cut for the more complicated rdapply().
>>>>
>>>> > lapply(rd, function(x) sum(x$score))
>>>> $chr1
>>>> [1] 3
>>>>
>>>> $chr2
>>>> [1] 0
>>>>
>>>> sapply() also works:
>>>> > sapply(rd, function(x) sum(x$score))
>>>> chr1 chr2
>>>>    3    0
>>>>
>>>> Another choice is tapply:
>>>> > tapply(rd$score, space(rd), sum)
>>>> chr1 chr2
>>>>    3    0
>>>>
>>>> Michael
>>>>
>>>>> The code below computes the sum score for each space in the
>>>>> RangedData:
>>>>> # taken from the examples mostly:
>>>>> > ranges <- IRanges(c(1,2,3),c(4,5,6))
>>>>> > score <- c(2L, 0L, 1L)
>>>>> > rd <- RangedData(ranges, score, space = c("chr1","chr2","chr1"))
>>>>> > rd
>>>>> RangedData with 3 rows and 1 value column across 2 spaces
>>>>>         space    ranges |     score
>>>>>   <character> <IRanges> | <integer>
>>>>> 1        chr1    [1, 4] |         2
>>>>> 2        chr1    [3, 6] |         1
>>>>> 3        chr2    [2, 5] |         0
>>>>> > params <- RDApplyParams(rd, function(rd) sum(score(rd)))
>>>>> > rdapply(params)
>>>>> $chr1
>>>>> [1] 3
>>>>>
>>>>> $chr2
>>>>> [1] 0
>>>>>
>>>>> Cheers
>>>>> Michael
>>>>>
>>>>> On Apr 22, 2010, at 1:57 PM, Yvan wrote:
>>>>>
>>>>>> On 21/04/10 18:43, Michael Lawrence wrote:
>>>>>>>
>>>>>>> On Wed, Apr 21, 2010 at 6:07 AM, Yvan <yvan.strahm at uni.no> wrote:
>>>>>>>
>>>>>>> Hello List,
>>>>>>>
>>>>>>> I am confused about how to loop over a rangedData object.
>>>>>>> I have this rangedData
>>>>>>>
>>>>>>> RangedData with 61 rows and 1 value column across 3 spaces
>>>>>>>         space    ranges |     score
>>>>>>>   <character> <IRanges> | <numeric>
>>>>>>> 1  SCAFFOLD_1  [ 8,  8] |  -0.09405
>>>>>>>
>>>>>>> and the spaces are
>>>>>>>
>>>>>>> "SCAFFOLD_1" "SCAFFOLD_10" "SCAFFOLD_100"
>>>>>>>
>>>>>>> using aggregate it is possible to apply a function to one of
>>>>>>> the spaces
>>>>>>> aggregate(rd["SCAFFOLD_1"]$score, start =
>>>>>>> 1:(length(rd["SCAFFOLD_1"]$score)-w+1), width = w, FUN = sum)
>>>>>>>
>>>>>>> but how can I apply the aggregate to all spaces without a for
>>>>>>> loop?
>>>>>>>
>>>>>>> It looks like you're attempting a running window sum of the score
>>>>>>> vector. There are more efficient ways of doing this besides
>>>>>>> aggregate(). If you convert the score into an Rle, you can use
>>>>>>> runsum().
>>>>>>> Anyway, to do this over each space individually, use lapply().
>>>>>>>
>>>>>>> This would come out to something like:
>>>>>>>
>>>>>>> values(rd)[,"smoothScore"] <- lapply(rd, function(x)
>>>>>>> runsum(Rle(x$score), w))
>>>>>>>
>>>>>>> Probably not exactly right, but it gets you in the right
>>>>>>> direction...
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>> Hello Michael,
>>>>>>
>>>>>> Thanks for the answer and the tip about runsum!
>>>>>> I tried with lapply but could not get it working right; the main
>>>>>> problem is that the runsum is calculated on all values and not
>>>>>> for each specific space.
>>>>>> Sorry, I should have been more precise in the problem description.
>>>>>> The runsum should be calculated in a space-specific manner, let's
>>>>>> say w=2
>>>>>>
>>>>>>    space  score cumsum
>>>>>> 1 space1      1      3
>>>>>> 2 space1      2      4
>>>>>> 3 space1      2     NA
>>>>>> 4 space2     10     21
>>>>>> 5 space2     11     22
>>>>>> 6 space2     11     NA
>>>>>>
>>>>>> Is it possible to do it with lapply?
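
The space-specific window sum in the table above (w = 2) can be reproduced with a small base-R sketch. This is only an illustration under stated assumptions: it uses a plain data frame and base R's ave() rather than RangedData, lapply() or runsum(), and window_sum is a hypothetical helper name that does not come from IRanges.

```r
# Hypothetical helper (not part of IRanges): forward-looking window sum of
# width w. The last w - 1 positions have no complete window and become NA.
window_sum <- function(x, w) {
  n <- length(x)
  if (n < w) return(rep(NA_real_, n))
  cs <- cumsum(c(0, x))
  c(cs[(w + 1):(n + 1)] - cs[1:(n - w + 1)], rep(NA_real_, w - 1))
}

# Toy data copied from the table in the thread.
scores <- data.frame(
  space = rep(c("space1", "space2"), each = 3),
  score = c(1, 2, 2, 10, 11, 11)
)
w <- 2

# ave() applies the function within each level of 'space' and returns the
# results in the original row order, so windows never cross a space boundary.
scores$winsum <- ave(scores$score, scores$space,
                     FUN = function(x) window_sum(x, w))
print(scores)
```

Because the grouping happens inside ave(), no explicit for loop over spaces is needed; the same per-group idea is what lapply() over a RangedData provides in the thread.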
>>>>>> Thanks again for your help
>>>>>> cheers,
>>>>>> yvan
>>>>>>
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor

Answer: loop over IRanges spaces
Michael Lawrence wrote:
On Tue, May 11, 2010 at 2:15 AM, Yvan <yvan.strahm@uni.no> wrote:
> Hello,
>
> I tried your code below, but I am not able to create the scoreRleList,
> as weight can only be an integer list.
>
> > read_gff
> function(file="gff3.txt") {
>     gff <- read.delim(file,header=FALSE)
>     colnames(gff) <- c("seqname", "source", "feature", "start", "end",
>                        "score", "strand","frame", "comments")
>     return(gff)
> }

The rtracklayer package would do the above for you.
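
As a rough illustration of that suggestion, the sketch below writes a one-line GFF file to a temporary path and reads it back with rtracklayer's import(). The file contents and variable names are made up for the example (only the SCAFFOLD_1 probe at position 8 with score -0.09405 is taken from the thread), the code is skipped when rtracklayer is not installed, and the class of the returned object depends on the installed rtracklayer version.

```r
# Write a tiny, made-up GFF3 file with the nine standard tab-separated columns.
gff_path <- tempfile(fileext = ".gff3")
writeLines(
  c("##gff-version 3",
    "SCAFFOLD_1\texample\tprobe\t8\t8\t-0.09405\t.\t.\tID=probe1"),
  gff_path
)

features <- NULL
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  # import() chooses the parser from the file extension and names the
  # columns itself, so no hand-written colnames() call is needed.
  features <- rtracklayer::import(gff_path)
  print(features)
}
```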
> > library(IRanges)
>
> Attaching package: 'IRanges'
>
> The following object(s) are masked from 'package:base':
>
>     cbind, Map, mapply, order, paste, pmax, pmax.int, pmin, pmin.int,
>     rbind, rep.int, table
>
> > gff<-read_gff("/home/yvan/Lundalm/R/projects/coen/gff_original/50Lines.gff")
> > rd<-RangedData(IRanges(gff$start,width=1),score=gff$score,start=gff$start,space=gff$seqname)
> > scoreRleList<-coverage(rd, weight ="score",width = as.list(table(rd$space)))
> Error in .local(x, shift, width, weight, ...) :
>   'weight' must be a non-empty list of integers
>
> Can I change/force the type of 'weight'?

It sounds like your scores are floating point values, so they become a numeric vector in R. Just coerce them with as.integer:

rd$score <- as.integer(rd$score)

That might get you closer. You might want to multiply by some power of 10 first to save some precision.

> In a different approach, I try to use different views on the data set.
> This was motivated by the fact that the difference on the start of each
> probes could indicate either a change of scaffold or a gap inside a
> scaffold.
>
> > first_view<-slice(diff(rd$start), lower=10, upper=200)
> > first_view
> Views on a 60-integer XInteger subject
> subject: 28 24 25 37 23 24 32 34 25 23 63 ... 24 35 24 29 31 28 19 29 27 29 37
> views:
>     start end width
> [1]     1  19    19 [28 24 25 37 23 24 32 34 25 23 63 21 36 27 19 33 33 24 32]
> [2]    21  39    19 [ 31 22 34 18 36 25 ... 31 108 34 28 26 23 33]
> [3]    41  60    20 [32 28 26 20 33 25 30 30 24 ... 29 31 28 19 29 27 29 37 22]
>
> so here the three views are the three scaffold. but if I want to change
> the subject of the view to the score
>
> > second_view<-Views(subject=rd$score,start=first_view$start,end=first_view$end)
> Error in function (classes, fdef, mtable) :
>   unable to find an inherited method for function "Views", for signature
>   "numeric"
>
> Again can I change the expected type of the signature in the Views
> function.

That would take a lot of work, involving a new NumericView class. Much simpler to just coerce the scores to integer as in the above.

Michael

> Thanks again for your help.
> Cheers,
> yvan
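
The coercion advice above (multiply by a power of 10, then use as.integer) can be sketched in base R as follows. The score values other than -0.09405, the scale factor, and the segment boundaries are assumptions chosen for illustration; the segment sums stand in for what integer-based views would summarise and do not use the IRanges Views API.

```r
# Example scores; only -0.09405 appears in the thread, the rest are made up.
score <- c(-0.09405, 0.12345, 0.5, -1.25)

# Assumed scale: keep five decimal places.
scale_factor <- 1e5

# as.integer() truncates toward zero, so round first to avoid off-by-one
# results caused by floating point representation.
score_int <- as.integer(round(score * scale_factor))

# Hypothetical segments, given as inclusive start and end indices.
seg_start <- c(1, 3)
seg_end   <- c(2, 4)

# Sum the integer scores per segment, then undo the scaling.
# Note: R integers are 32-bit, so very large scaled sums can overflow to NA.
seg_sum_int <- mapply(function(s, e) sum(score_int[s:e]), seg_start, seg_end)
seg_sum <- seg_sum_int / scale_factor
print(seg_sum)
```

Dividing by the same scale factor at the end recovers values on the original score scale, at the precision chosen when scaling.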