Search
Question: endoapply over GRangesList fails when an empty GRanges object is returned (Bug?)
0
gravatar for Steve Lianoglou
7.2 years ago by
Genentech
Steve Lianoglou12k wrote:
Hi all, I think this is the wrong behavior .. an endoapply(GRangesList, function(...)) will fail if the embedded function returns and empty GRanges object as one of its operations. Calling GRangesList(lapply(GRangesList, function(...))) will work. So too would an endoapply over an IRangesList that does the same thing ... In the code below -- the result that's meant to be stored in "fail" is what I'm talking about. The other two code blocks is just to show you that it works if you do an end-around of the endoapply(GRangesList, ...). I think I would expect the result that comes back in "good" down below to be the same thing that "fail" should be. gr <- GRanges(seqnames='chr1', strand='+', ranges=IRanges(start=c(sample(18:22, 10, replace=TRUE), sample(100:108, 10, replace=TRUE)), width=sample(18:20, 20, replace=TRUE))) strand(gr)[2:5] <- '-' grl <- split(gr, strand(gr)) itake <- IRanges(start=100, end=150) gtake <- GRanges(seqnames='chr1', ranges=itake) ## This will fail, don't think it should fail <- endoapply(grl, function(x) { subsetByOverlaps(x, gtake) }) ## This works, and ultimately is what the previous one should be good <- GRangesList(lapply(grl, function(x) { take <- GRanges(seqnames='chr1', ranges=itake) subsetByOverlaps(x, gtake) })) ## Test IRangesList: this works. irl <- IRangesList(lapply(grl, ranges)) check <- endoapply(irl, function(x) { subsetByOverlaps(x, itake) }) Thanks, -steve R version 2.12.0 Under development (unstable) (2010-08-18 r52771) Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] org.Hs.eg.db_2.4.1 AnnotationDbi_1.11.4 [3] Biobase_2.9.0 BSgenome.Hsapiens.UCSC.hg18_1.3.16 [5] ggplot2_0.8.8 proto_0.3-8 [7] reshape_0.8.3 doMC_1.2.1 [9] multicore_0.1-3 foreach_1.3.0 [11] codetools_0.2-2 iterators_1.0.3 [13] GenomeGraphs_1.9.0 biomaRt_2.5.1 [15] bitops_1.0-4.1 Rsamtools_1.1.12 [17] data.table_1.5 plyr_1.1 [19] BSgenome_1.17.6 Biostrings_2.17.28 [21] RSQLite_0.9-2 DBI_0.2-5 [23] GenomicFeatures_1.1.11 GenomicRanges_1.1.20 [25] IRanges_1.7.19 loaded via a namespace (and not attached): [1] RCurl_1.4-3 rtracklayer_1.9.7 tools_2.12.0 XML_3.1-1 -- Steve Lianoglou Graduate Student: Computational Systems Biology ?| Memorial Sloan-Kettering Cancer Center ?| Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
ADD COMMENTlink modified 7.2 years ago by Michael Lawrence9.8k • written 7.2 years ago by Steve Lianoglou12k
0
gravatar for Michael Lawrence
7.2 years ago by
United States
Michael Lawrence9.8k wrote:
On Tue, Aug 24, 2010 at 2:21 PM, Steve Lianoglou < mailinglist.honeypot@gmail.com> wrote: > Hi all, > > I think this is the wrong behavior .. an endoapply(GRangesList, > function(...)) will fail if the embedded function returns and empty > GRanges object as one of its operations. > > Calling GRangesList(lapply(GRangesList, function(...))) will work. > So too would an endoapply over an IRangesList that does the same thing ... > > In the code below -- the result that's meant to be stored in "fail" is > what I'm talking about. The other two code blocks is just to show you > that it works if you do an end-around of the endoapply(GRangesList, > ...). > > I think I would expect the result that comes back in "good" down below > to be the same thing that "fail" should be. > > > gr <- GRanges(seqnames='chr1', strand='+', > ranges=IRanges(start=c(sample(18:22, 10, replace=TRUE), > sample(100:108, 10, replace=TRUE)), > width=sample(18:20, 20, replace=TRUE))) > strand(gr)[2:5] <- '-' > grl <- split(gr, strand(gr)) > > itake <- IRanges(start=100, end=150) > gtake <- GRanges(seqnames='chr1', ranges=itake) > > ## This will fail, don't think it should > fail <- endoapply(grl, function(x) { > subsetByOverlaps(x, gtake) > }) > > ## This works, and ultimately is what the previous one should be > good <- GRangesList(lapply(grl, function(x) { > take <- GRanges(seqnames='chr1', ranges=itake) > subsetByOverlaps(x, gtake) > })) > > ## Test IRangesList: this works. > irl <- IRangesList(lapply(grl, ranges)) > I assume someone else will take care of the bug, but wanted to mention that you could replace the above command with: irl <- seqapply(grl, ranges) endoapply is a stricter version of seqapply that enforces that the return value is the same type as the first argument. check <- endoapply(irl, function(x) { > subsetByOverlaps(x, itake) > }) > > Thanks, > -steve > > R version 2.12.0 Under development (unstable) (2010-08-18 r52771) > Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit) > > locale: > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] grid stats graphics grDevices utils datasets > methods base > > other attached packages: > [1] org.Hs.eg.db_2.4.1 AnnotationDbi_1.11.4 > [3] Biobase_2.9.0 BSgenome.Hsapiens.UCSC.hg18_1.3.16 > [5] ggplot2_0.8.8 proto_0.3-8 > [7] reshape_0.8.3 doMC_1.2.1 > [9] multicore_0.1-3 foreach_1.3.0 > [11] codetools_0.2-2 iterators_1.0.3 > [13] GenomeGraphs_1.9.0 biomaRt_2.5.1 > [15] bitops_1.0-4.1 Rsamtools_1.1.12 > [17] data.table_1.5 plyr_1.1 > [19] BSgenome_1.17.6 Biostrings_2.17.28 > [21] RSQLite_0.9-2 DBI_0.2-5 > [23] GenomicFeatures_1.1.11 GenomicRanges_1.1.20 > [25] IRanges_1.7.19 > > loaded via a namespace (and not attached): > [1] RCurl_1.4-3 rtracklayer_1.9.7 tools_2.12.0 XML_3.1-1 > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact<http: cbio.mskc="" c.org="" %7elianos="" contact=""> > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD COMMENTlink written 7.2 years ago by Michael Lawrence9.8k
Steve, Thanks for reporting this issue. It turns out that the endoapply surfaced the error, but didn't cause it. The problem was upstream in the split,GRanges-method where there was a bug when splitting by a factor or 'factor' Rle that didn't have all levels present. (To check this try validObject(grl) on your example below.) This issue has been corrected in BioC 2.6 (GenomicRanges 1.0.8) and BioC 2.7 (GenomicRanges 1.1.21), which should be available for download from bioconductor.org within 36 hours. Cheers, Patrick On 8/24/10 2:36 PM, Michael Lawrence wrote: > On Tue, Aug 24, 2010 at 2:21 PM, Steve Lianoglou< > mailinglist.honeypot at gmail.com> wrote: > >> Hi all, >> >> I think this is the wrong behavior .. an endoapply(GRangesList, >> function(...)) will fail if the embedded function returns and empty >> GRanges object as one of its operations. >> >> Calling GRangesList(lapply(GRangesList, function(...))) will work. >> So too would an endoapply over an IRangesList that does the same thing ... >> >> In the code below -- the result that's meant to be stored in "fail" is >> what I'm talking about. The other two code blocks is just to show you >> that it works if you do an end-around of the endoapply(GRangesList, >> ...). >> >> I think I would expect the result that comes back in "good" down below >> to be the same thing that "fail" should be. >> >> >> gr<- GRanges(seqnames='chr1', strand='+', >> ranges=IRanges(start=c(sample(18:22, 10, replace=TRUE), >> sample(100:108, 10, replace=TRUE)), >> width=sample(18:20, 20, replace=TRUE))) >> strand(gr)[2:5]<- '-' >> grl<- split(gr, strand(gr)) >> >> itake<- IRanges(start=100, end=150) >> gtake<- GRanges(seqnames='chr1', ranges=itake) >> >> ## This will fail, don't think it should >> fail<- endoapply(grl, function(x) { >> subsetByOverlaps(x, gtake) >> }) >> >> ## This works, and ultimately is what the previous one should be >> good<- GRangesList(lapply(grl, function(x) { >> take<- GRanges(seqnames='chr1', ranges=itake) >> subsetByOverlaps(x, gtake) >> })) >> >> ## Test IRangesList: this works. >> irl<- IRangesList(lapply(grl, ranges)) >> > I assume someone else will take care of the bug, but wanted to mention that > you could replace the above command with: > > irl<- seqapply(grl, ranges) > > endoapply is a stricter version of seqapply that enforces that the return > value is the same type as the first argument. > > check<- endoapply(irl, function(x) { >> subsetByOverlaps(x, itake) >> }) >> >> Thanks, >> -steve >> >> R version 2.12.0 Under development (unstable) (2010-08-18 r52771) >> Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit) >> >> locale: >> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 >> >> attached base packages: >> [1] grid stats graphics grDevices utils datasets >> methods base >> >> other attached packages: >> [1] org.Hs.eg.db_2.4.1 AnnotationDbi_1.11.4 >> [3] Biobase_2.9.0 BSgenome.Hsapiens.UCSC.hg18_1.3.16 >> [5] ggplot2_0.8.8 proto_0.3-8 >> [7] reshape_0.8.3 doMC_1.2.1 >> [9] multicore_0.1-3 foreach_1.3.0 >> [11] codetools_0.2-2 iterators_1.0.3 >> [13] GenomeGraphs_1.9.0 biomaRt_2.5.1 >> [15] bitops_1.0-4.1 Rsamtools_1.1.12 >> [17] data.table_1.5 plyr_1.1 >> [19] BSgenome_1.17.6 Biostrings_2.17.28 >> [21] RSQLite_0.9-2 DBI_0.2-5 >> [23] GenomicFeatures_1.1.11 GenomicRanges_1.1.20 >> [25] IRanges_1.7.19 >> >> loaded via a namespace (and not attached): >> [1] RCurl_1.4-3 rtracklayer_1.9.7 tools_2.12.0 XML_3.1-1 >> >> -- >> Steve Lianoglou >> Graduate Student: Computational Systems Biology >> | Memorial Sloan-Kettering Cancer Center >> | Weill Medical College of Cornell University >> Contact Info: http://cbio.mskcc.org/~lianos/contact<http: cbio.msk="" cc.org="" %7elianos="" contact=""> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLYlink written 7.2 years ago by Patrick Aboyoun1.6k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 151 users visited in the last hour