IRanges : Strange behavior subsetting an IntervalTree with an indexing variable from within a function
1
0
Entering edit mode
@steve-lianoglou-2771
Last seen 13 months ago
United States
Hi all, Sorry, I'm totally perplexed on what seems like something quite simple. I've tried to smoke this out in the debugger, but I still can't find an answer ... please bear with me as I explain what's going on. I'm using IRanges and trying to subset an IntervalTree using a vector of indices found from an ``overlap`` query. R doesn't like this and behaves as if the variable I'm using to subset my IntervalTree essentially doesn't exist, but it does. Here's the salient piece of code from WITHIN a function. I'll show you some results in a debugging session to illustrate what I mean. * ir is an IntervalTree * ranges.list is a SimpleRangesList * rl is therefore an IRanges object: dist <- lapply(range.list, function(rl) { hits <- subjectHits(overlap(ir, rl)) browser() # The debug calls below start from here if (length(hits) > 1) { diff(start(ir[hits])) } else { NA } }) The problem is that R somehow thinks the ``hits`` vector is non- existent when I'm trying to use it to index the IntervalTree (ir), however the ``hits`` vector is, in fact, defined. Browse[1]> hits [1] 158 159 160 161 OK, so hits exists, but: Browse[1]> ir[hits] Error in eval(expr, envir, enclos) : object 'hits' not found And if I just create a vector 158:161 "inline", indexing works fine: Browse[1]> ir[158:161] IntervalTree instance: start end width [1] 34576 34581 6 [2] 34608 34613 6 [3] 34635 34640 6 [4] 34888 34893 6 Using ANY variable to index doesn't work at all: Browse[1]> b <- 158:161 Browse[1]> b [1] 158 159 160 161 Browse[1]> ir[b] Error in eval(expr, envir, enclos) : object 'b' not found If I quite out of the browser() and, essentially, bail from the running function, these things now work against the same interval tree (it was passed in as one of the params to the function): Browse[1]> Q R> m <- 158:161 R> ir[m] IntervalTree instance: start end width [1] 34576 34581 6 [2] 34608 34613 6 [3] 34635 34640 6 [4] 34888 34893 6 I'm at a loss, I tried looking for the definition of '[' in the IRanges::IntervalTree implementation (not defined), but IRanges::'[' is ... dunno, it looks fine to me. Does anyone have an idea of what I'm doing wrong? -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
Cancer IRanges Cancer IRanges • 1.2k views
ADD COMMENT
0
Entering edit mode
@steve-lianoglou-2771
Last seen 13 months ago
United States
Hmm ... continuing from the previous email, switching the code out of lapply and into a for loop seems to fix. So, instead of: dist <- lapply(range.list, function(rl) { hits <- subjectHits(overlap(ir, rl)) browser() # The debug calls below start from here if (length(hits) > 1) { diff(start(ir[hits])) } else { NA } }) do this: dist <- list() for (idx in seq(range.list)) { hits <- subjectHits(overlap(ir, range.list[[idx]])) dist[[idx]] <- if (length(hits) > 1) { diff(start(ir[hits])) } else { NA } } That works. Weird, no? Of course, I forgot to mention sessionInfo() -- sorry. R> > sessionInfo() R version 2.10.0 Under development (unstable) (2009-09-07 r49613) x86_64-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] RSQLite_0.7-2 DBI_0.2-4 ShortRead_1.3.33 lattice_0.17-25 BSgenome_1.13.11 doMC_1.1.1 multicore_0.1-3 [8] foreach_1.2.1 codetools_0.2-2 iterators_1.0.2 Biostrings_2.13.36 IRanges_1.3.69 ARE.utils_0.1.0 loaded via a namespace (and not attached): [1] Biobase_2.5.5 grid_2.10.0 hwriter_1.1 tools_2.10.0 -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
ADD COMMENT
0
Entering edit mode
Argh! Wrapping into the for loop actually didn't fix anything -- the symptoms of the problem have matured in a way that make even less sense to me now. It just seems as if the value of my indexing variable (``hits``) is stuck on the value that was first used when trying to index into my IntervalTree -- every time I use ``hits`` it always returns the same bunch of Intervals that were returned the first time my interval tree was indexed, even though its (hits) current value is different. For instance, in the 6th iteration of the loop: Browse[1]> hits [1] 3025 3026 3027 3028 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 Browse[1]> ir[hits] IntervalTree instance: start end width [1] 34576 34581 6 [2] 34608 34613 6 [3] 34635 34640 6 [4] 34888 34893 6 Clearly that's wrong there should be 13 intervals returned. Those values being returned are from the "hits" vector that was calculated on the first pass of the loop, which was 158:161. This was calc'd 5 iterations ago, though, and should be long since gone!. Browse[1]> ir[158:161] IntervalTree instance: start end width [1] 34576 34581 6 [2] 34608 34613 6 [3] 34635 34640 6 [4] 34888 34893 6 Let's make ``hits`` something completely inane: Browse[1]> hits <- 0 Browse[1]> ir[hits] IntervalTree instance: start end width [1] 34576 34581 6 [2] 34608 34613 6 [3] 34635 34640 6 [4] 34888 34893 6 Meh ... this is only happening when indexing my IntervalTree, though: Browse[1]> m <- matrix(1:100, 10,10) Browse[1]> m[hits] integer(0) Still -- indexing with a NEW var, one that was never defined before, fails again on the IntervalTree: Browse[1]> yy <- 1:10 Browse[1]> ir[yy] Error in eval(expr, envir, enclos) : object 'yy' not found Browse[1]> m[yy] [1] 1 2 3 4 5 6 7 8 9 10 I'm stuck in bizarro land, and don't know how to get out ... anybody have a map? -steve On Sep 9, 2009, at 1:03 PM, Steve Lianoglou wrote: > Hmm ... continuing from the previous email, switching the code out > of lapply and into a for loop seems to fix. > > So, instead of: > > dist <- lapply(range.list, function(rl) { > hits <- subjectHits(overlap(ir, rl)) > browser() # The debug calls below start from here > if (length(hits) > 1) { > diff(start(ir[hits])) > } else { > NA > } > }) > > do this: > > dist <- list() > for (idx in seq(range.list)) { > hits <- subjectHits(overlap(ir, range.list[[idx]])) > dist[[idx]] <- if (length(hits) > 1) { > diff(start(ir[hits])) > } else { > NA > } > } > > That works. Weird, no? > > Of course, I forgot to mention sessionInfo() -- sorry. > > R> > sessionInfo() > R version 2.10.0 Under development (unstable) (2009-09-07 r49613) > x86_64-apple-darwin9.8.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] RSQLite_0.7-2 DBI_0.2-4 ShortRead_1.3.33 > lattice_0.17-25 BSgenome_1.13.11 doMC_1.1.1 > multicore_0.1-3 > [8] foreach_1.2.1 codetools_0.2-2 iterators_1.0.2 > Biostrings_2.13.36 IRanges_1.3.69 ARE.utils_0.1.0 > > loaded via a namespace (and not attached): > [1] Biobase_2.5.5 grid_2.10.0 hwriter_1.1 tools_2.10.0 > > -steve > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact >
ADD REPLY
0
Entering edit mode
Hi Steve -- I suspect the developer will get to this shortly, and that this has to do with internal issues in R. So your best bet is to give it a break for a couple of hours. In the mean time, you can find out what's going on with > selectMethod("[", "IntervalTree") Method Definition: function (x, i, j, ..., drop = TRUE) { cl <- class(x) mc <- match.call() mc$x <- as(x, "IRanges") as(eval(mc), cl) } <environment: namespace:iranges=""> Signatures: x i j target "IntervalTree" "ANY" "ANY" defined "Ranges" "ANY" "ANY" > trace("[", signature="IntervalTree", tracer=browser) Tracing specified method for function "[" in package "base" [1] "[" > query <- IRanges(c(1, 4, 9), c(5, 7, 10)) > subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) > tree <- IntervalTree(subject) > tree[1:2] Tracing tree[1:2] on entry Called from: eval(expr, envir, enclos) Browse[1]> n my guess is that 'eval' is in the wrong frame, probably because of the way the methods package manipulates things. I'm not sure why "[",IntervalTree was implemented this way. Martin Steve Lianoglou wrote: > Argh! > > Wrapping into the for loop actually didn't fix anything -- the symptoms > of the problem have matured in a way that make even less sense to me now. > > It just seems as if the value of my indexing variable (``hits``) is > stuck on the value that was first used when trying to index into my > IntervalTree -- every time I use ``hits`` it always returns the same > bunch of Intervals that were returned the first time my interval tree > was indexed, even though its (hits) current value is different. > > For instance, in the 6th iteration of the loop: > > Browse[1]> hits > [1] 3025 3026 3027 3028 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 > > Browse[1]> ir[hits] > IntervalTree instance: > start end width > [1] 34576 34581 6 > [2] 34608 34613 6 > [3] 34635 34640 6 > [4] 34888 34893 6 > > Clearly that's wrong there should be 13 intervals returned. > > Those values being returned are from the "hits" vector that was > calculated on the first pass of the loop, which was 158:161. This was > calc'd 5 iterations ago, though, and should be long since gone!. > > Browse[1]> ir[158:161] > IntervalTree instance: > start end width > [1] 34576 34581 6 > [2] 34608 34613 6 > [3] 34635 34640 6 > [4] 34888 34893 6 > > Let's make ``hits`` something completely inane: > > Browse[1]> hits <- 0 > Browse[1]> ir[hits] > IntervalTree instance: > start end width > [1] 34576 34581 6 > [2] 34608 34613 6 > [3] 34635 34640 6 > [4] 34888 34893 6 > > Meh ... this is only happening when indexing my IntervalTree, though: > > Browse[1]> m <- matrix(1:100, 10,10) > Browse[1]> m[hits] > integer(0) > > > Still -- indexing with a NEW var, one that was never defined before, > fails again on the IntervalTree: > > Browse[1]> yy <- 1:10 > Browse[1]> ir[yy] > Error in eval(expr, envir, enclos) : object 'yy' not found > > Browse[1]> m[yy] > [1] 1 2 3 4 5 6 7 8 9 10 > > I'm stuck in bizarro land, and don't know how to get out ... anybody > have a map? > > -steve > > On Sep 9, 2009, at 1:03 PM, Steve Lianoglou wrote: > >> Hmm ... continuing from the previous email, switching the code out of >> lapply and into a for loop seems to fix. >> >> So, instead of: >> >> dist <- lapply(range.list, function(rl) { >> hits <- subjectHits(overlap(ir, rl)) >> browser() # The debug calls below start from here >> if (length(hits) > 1) { >> diff(start(ir[hits])) >> } else { >> NA >> } >> }) >> >> do this: >> >> dist <- list() >> for (idx in seq(range.list)) { >> hits <- subjectHits(overlap(ir, range.list[[idx]])) >> dist[[idx]] <- if (length(hits) > 1) { >> diff(start(ir[hits])) >> } else { >> NA >> } >> } >> >> That works. Weird, no? >> >> Of course, I forgot to mention sessionInfo() -- sorry. >> >> R> > sessionInfo() >> R version 2.10.0 Under development (unstable) (2009-09-07 r49613) >> x86_64-apple-darwin9.8.0 >> >> locale: >> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] RSQLite_0.7-2 DBI_0.2-4 ShortRead_1.3.33 >> lattice_0.17-25 BSgenome_1.13.11 doMC_1.1.1 multicore_0.1-3 >> [8] foreach_1.2.1 codetools_0.2-2 iterators_1.0.2 >> Biostrings_2.13.36 IRanges_1.3.69 ARE.utils_0.1.0 >> >> loaded via a namespace (and not attached): >> [1] Biobase_2.5.5 grid_2.10.0 hwriter_1.1 tools_2.10.0 >> >> -steve >> >> -- >> Steve Lianoglou >> Graduate Student: Computational Systems Biology >> | Memorial Sloan-Kettering Cancer Center >> | Weill Medical College of Cornell University >> Contact Info: http://cbio.mskcc.org/~lianos/contact >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Hi, On Sep 9, 2009, at 2:12 PM, Martin Morgan wrote: > Hi Steve -- I suspect the developer will get to this shortly, and that > this has to do with internal issues in R. So your best bet is to > give it > a break for a couple of hours. > > In the mean time, you can find out what's going on with > >> selectMethod("[", "IntervalTree") Hah ... wow, need to learn S4 ... I (dumbly) thought it was using the '[' defined for IRanges classes, even though inherits(myIntervalTree, 'IRanges') is FALSE ... > Method Definition: > > function (x, i, j, ..., drop = TRUE) > { > cl <- class(x) > mc <- match.call() > mc$x <- as(x, "IRanges") > as(eval(mc), cl) > } > <environment: namespace:iranges=""> > > Signatures: > x i j > target "IntervalTree" "ANY" "ANY" > defined "Ranges" "ANY" "ANY" >> trace("[", signature="IntervalTree", tracer=browser) > Tracing specified method for function "[" in package "base" > [1] "[" >> query <- IRanges(c(1, 4, 9), c(5, 7, 10)) >> subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) >> tree <- IntervalTree(subject) >> tree[1:2] > Tracing tree[1:2] on entry > Called from: eval(expr, envir, enclos) > Browse[1]> n > > my guess is that 'eval' is in the wrong frame, probably because of the > way the methods package manipulates things. I'm not sure why > "[",IntervalTree was implemented this way. Indeed! Anyway, from your hint I found the appropriate function (notice the fixme note :-) in Ranges-class.R: ### FIXME: hopefully temporary setMethod("[", "Ranges", function(x, i, j, ..., drop) { cl <- class(x) mc <- match.call() mc$x <- as(x, "IRanges") as(eval(mc), cl) } ) in Ranges-class.R I tweaked it to capture the parent.frame and now it works: setMethod("[", "Ranges", function(x, i, j, ..., drop) { cl <- class(x) mc <- match.call() pf <- parent.frame() mc$x <- as(x, "IRanges") as(eval(mc, pf), cl) } ) It seems that the match.call is taking the name of my the variable I'm passing into the argument as i, instead of using ``i`` itself, for instance, using your example. R> idx <- 1:2 R> tree[idx] Looking at the value of mc, we see: debug: mc$x <- as(x, "IRanges") Browse[2]> mc the.tree[i = idx] So ... sending the parent.frame allows the eval to see the appropriate idx var. I'm not sure if that's the fix you should use, but it seems to be working atm. Thanks for the heads up, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
ADD REPLY
0
Entering edit mode
Steve, Thanks for uncovering this issue. I have checked in a BioC 2.5 fix to the method along the lines you have stated: setMethod("[", "Ranges", function(x, i, j, ..., drop) { cl <- class(x) mc <- match.call() mc$x <- as(x, "IRanges") as(eval(mc, parent.frame()), cl) } ) This wasn't my code, but I would guess the author tried more robust alternatives (e.g. callGeneric()) to hand manipulating the function call, but was stymied by "[" being a privative method. The fix can be retrieved by svn now or by biocLite after tomorrow's build finishes. Patrick Steve Lianoglou wrote: > Hi, > > On Sep 9, 2009, at 2:12 PM, Martin Morgan wrote: > >> Hi Steve -- I suspect the developer will get to this shortly, and that >> this has to do with internal issues in R. So your best bet is to give it >> a break for a couple of hours. >> >> In the mean time, you can find out what's going on with >> >>> selectMethod("[", "IntervalTree") > > Hah ... wow, need to learn S4 ... I (dumbly) thought it was using the > '[' defined for IRanges classes, even though inherits(myIntervalTree, > 'IRanges') is FALSE ... > >> Method Definition: >> >> function (x, i, j, ..., drop = TRUE) >> { >> cl <- class(x) >> mc <- match.call() >> mc$x <- as(x, "IRanges") >> as(eval(mc), cl) >> } >> <environment: namespace:iranges=""> >> >> Signatures: >> x i j >> target "IntervalTree" "ANY" "ANY" >> defined "Ranges" "ANY" "ANY" >>> trace("[", signature="IntervalTree", tracer=browser) >> Tracing specified method for function "[" in package "base" >> [1] "[" >>> query <- IRanges(c(1, 4, 9), c(5, 7, 10)) >>> subject <- IRanges(c(2, 2, 10), c(2, 3, 12)) >>> tree <- IntervalTree(subject) >>> tree[1:2] >> Tracing tree[1:2] on entry >> Called from: eval(expr, envir, enclos) >> Browse[1]> n >> >> my guess is that 'eval' is in the wrong frame, probably because of the >> way the methods package manipulates things. I'm not sure why >> "[",IntervalTree was implemented this way. > > Indeed! > > Anyway, from your hint I found the appropriate function (notice the > fixme note :-) in Ranges-class.R: > > ### FIXME: hopefully temporary > setMethod("[", "Ranges", > function(x, i, j, ..., drop) > { > cl <- class(x) > mc <- match.call() > mc$x <- as(x, "IRanges") > as(eval(mc), cl) > } > ) > > in Ranges-class.R > > I tweaked it to capture the parent.frame and now it works: > > setMethod("[", "Ranges", > function(x, i, j, ..., drop) > { > cl <- class(x) > mc <- match.call() > pf <- parent.frame() > mc$x <- as(x, "IRanges") > as(eval(mc, pf), cl) > } > ) > > It seems that the match.call is taking the name of my the variable I'm > passing into the argument as i, instead of using ``i`` itself, for > instance, using your example. > > R> idx <- 1:2 > R> tree[idx] > > Looking at the value of mc, we see: > > debug: mc$x <- as(x, "IRanges") > Browse[2]> mc > the.tree[i = idx] > > So ... sending the parent.frame allows the eval to see the appropriate > idx var. > > I'm not sure if that's the fix you should use, but it seems to be > working atm. > > Thanks for the heads up, > -steve > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY

Login before adding your answer.

Traffic: 687 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6