Confusion over inconsistencies with showMethods('Rle') when loaded via GenomicRanges
Peter Hickey ▴ 740
@petehaitch
WEHI, Melbourne, Australia
There appear to be different methods available for 'Rle' when loaded via the GenomicRanges package depending on whether a GRanges object has been created. Specifically, prior to a GRanges object being created there are no 'values = character' methods for 'Rle'. This doesn't make sense to me and is causing me problems in code I am developing.

The following code highlights the cause of my confusion:

| > library(GenomicRanges)
| Loading required package: BiocGenerics
|
| Attaching package: ‘BiocGenerics’
|
| The following object(s) are masked from ‘package:stats’:
|
|     xtabs
|
| The following object(s) are masked from ‘package:base’:
|
|     anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
|     get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
|     pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
|     rownames, sapply, setdiff, table, tapply, union, unique
|
| Loading required package: IRanges
| > showMethods('Rle')
| Function: Rle (package IRanges)
| values="missing", lengths="missing"
| values="vectorORfactor", lengths="integer"
| values="vectorORfactor", lengths="missing"
| values="vectorORfactor", lengths="numeric"

## Only 4 methods are available for Rle

| > seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
| > gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
| +               ranges = IRanges(1:10, width = 10:1, names = head(letters,10)),
| +               strand = Rle(strand(c("-", "+", "*", "+", "-")),c(1, 2, 2, 3, 2)),
| +               score = 1:10, GC = seq(1, 0, length=10),
| +               seqinfo = seqinfo)
| > gr
| GRanges with 10 ranges and 2 metadata columns:
|     seqnames    ranges strand |     score                GC
|        <rle> <iranges>  <rle> | <integer>         <numeric>
|   a     chr1  [ 1, 10]      - |         1                 1
|   b     chr2  [ 2, 10]      + |         2 0.888888888888889
|   c     chr2  [ 3, 10]      + |         3 0.777777777777778
|   d     chr2  [ 4, 10]      * |         4 0.666666666666667
|   e     chr1  [ 5, 10]      * |         5 0.555555555555556
|   f     chr1  [ 6, 10]      + |         6 0.444444444444444
|   g     chr3  [ 7, 10]      + |         7 0.333333333333333
|   h     chr3  [ 8, 10]      + |         8 0.222222222222222
|   i     chr3  [ 9, 10]      - |         9 0.111111111111111
|   j     chr3  [10, 10]      - |        10                 0
|   ---
|   seqlengths:
|    chr1 chr2 chr3
|    1000 2000 1500
| > showMethods('Rle')
| Function: Rle (package IRanges)
| values="character", lengths="integer"
|     (inherited from: values="vectorORfactor", lengths="integer")
| values="character", lengths="numeric"
|     (inherited from: values="vectorORfactor", lengths="numeric")
| values="factor", lengths="integer"
|     (inherited from: values="vectorORfactor", lengths="integer")
| values="factor", lengths="numeric"
|     (inherited from: values="vectorORfactor", lengths="numeric")
| values="missing", lengths="missing"
| values="vectorORfactor", lengths="integer"
| values="vectorORfactor", lengths="missing"
| values="vectorORfactor", lengths="numeric"

## Now, there are 8 methods available for Rle

Is this a bug or am I missing something? If I'm just missing something, can someone please explain how I can ensure that the methods involving 'values = character' are available to me upon loading of the GenomicRanges package?

Many thanks,
Pete

--------------------------------
Peter Hickey,
PhD Student/Research Assistant,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Ph: +613 9345 2324
hickey@wehi.edu.au
http://www.wehi.edu.au
GenomicRanges GenomicRanges • 2.4k views
@steve-lianoglou-2771
United States
Hi,

On Wed, Jan 30, 2013 at 4:57 AM, <hickey at wehi.edu.au> wrote:
> [snip]
>
> Is this a bug or am I missing something? If I'm just missing something, can someone please explain how I can ensure that the methods involving 'values = character' are available to me upon loading of the GenomicRanges package?

It's not a bug. It just means that Rle was used on a character vector, which doesn't have its own signature, and you are being told that it used the method defined for values="vectorORfactor", as that is the one most closely related to the inputs provided, given the methods already defined and the class hierarchy.

For instance, let's get on the same initial page:

R> library(GenomicRanges)
R> showMethods("Rle")
Function: Rle (package IRanges)
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

Ok -- now I try to create an Rle from a character vector:

R> set.seed(123)
R> x <- Rle(sample(letters[1:5], 100, replace=TRUE))
R> showMethods("Rle")
Function: Rle (package IRanges)
values="character", lengths="integer"
    (inherited from: values="vectorORfactor", lengths="integer")
values="character", lengths="missing"
    (inherited from: values="vectorORfactor", lengths="missing")
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

Looks like the initial call to Rle(x) triggers the c(values="character", lengths="missing") "version" of the function. There is no "direct/specific" implementation of this function, so R grabs the next closest thing (inherited from c(values="vectorORfactor", lengths="missing")). That function likely calls, internally, a version of the function with a signature like c(values="character", lengths="integer"), which R wants to tell you has no direct implementation either; that accounts for the second "inherited" entry, defined with vectorORfactor and integer inputs.

The question is -- what makes you think there is no version of Rle that accepts just a character vector as its first argument when you load GenomicRanges from the get go? Does the above example not work for you in a clean R session?

I'm guessing something else is going wrong with your code, but we'll need some sort of minimal reproducible example to help sort that out.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
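
The caching behaviour described in this reply can be reproduced without Bioconductor. What follows is only a minimal sketch using the base `methods` package: `toyRle` and `toyVectorORfactor` are hypothetical stand-ins invented for illustration, not the real IRanges definitions, and the method bodies are assumptions about the general shape of such code rather than what IRanges actually does.

```r
library(methods)

# Hypothetical stand-ins for the IRanges definitions (not the real code).
setClassUnion("toyVectorORfactor", c("vector", "factor"))
setGeneric("toyRle", function(values, lengths) standardGeneric("toyRle"))

# Method 1: run lengths supplied as an integer vector.
setMethod("toyRle", signature(values = "toyVectorORfactor", lengths = "integer"),
          function(values, lengths) list(values = values, lengths = lengths))

# Method 2: lengths missing; compute the runs, then re-dispatch to method 1.
setMethod("toyRle", signature(values = "toyVectorORfactor", lengths = "missing"),
          function(values, lengths) {
            runs <- rle(as.vector(values))
            toyRle(runs$values, runs$lengths)
          })

# No method is defined directly for a character 'values' argument,
# so this is FALSE both before and after any call.
defined_directly <- existsMethod("toyRle", signature("character", "missing"))

# Dispatch can still locate a method by inheritance through the class union.
found <- selectMethod("toyRle", signature("character", "missing"))

# The first real call caches the inherited methods it used, which is why
# showMethods() lists extra "(inherited from: ...)" entries afterwards.
x <- toyRle(c("a", "a", "b"))
showMethods("toyRle", inherited = TRUE)
```

The point of the sketch is that the longer listing after the first call reflects dispatch results being cached, not new methods being defined; `existsMethod()` only reports methods that were defined explicitly.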
Hi Steve,

Thanks for your explanation. I'm just learning about the S4 class and methods so I suspected I'd missed something. I ran your example on my machine and it returned the same output.

I've now found the real problem in my code but don't understand why it is causing inheritance problems for Rle. Basically, there's a line in my class definitions to define a class union, namely: setClassUnion('vectorOrNULL', c("vector", "NULL")). Whether that line is run before I try to construct the GRanges object determines whether the object is successfully created. Can anyone please explain this to me?

Here is a minimal reproducible example:

## This version works as intended
> library(GenomicRanges)
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following object(s) are masked from ‘package:stats’:

    xtabs

The following object(s) are masked from ‘package:base’:

    anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
    get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
    rownames, sapply, setdiff, table, tapply, union, unique

Loading required package: IRanges
> out <- list(chr = rep('chr21', 10), 1:10, start = 1:10, end = 2:11)
> showMethods('Rle')
Function: Rle (package IRanges)
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

> gr <- GRanges(seqnames = out[['chr']], ranges = IRanges(start = out[['start']], end = out[['end']]))
> gr
GRanges with 10 ranges and 0 metadata columns:
       seqnames    ranges strand
          <rle> <iranges>  <rle>
   [1]    chr21  [ 1,  2]      *
   [2]    chr21  [ 2,  3]      *
   [3]    chr21  [ 3,  4]      *
   [4]    chr21  [ 4,  5]      *
   [5]    chr21  [ 5,  6]      *
   [6]    chr21  [ 6,  7]      *
   [7]    chr21  [ 7,  8]      *
   [8]    chr21  [ 8,  9]      *
   [9]    chr21  [ 9, 10]      *
  [10]    chr21  [10, 11]      *
  ---
  seqlengths:
   chr21
      NA
> showMethods('Rle')
Function: Rle (package IRanges)
values="character", lengths="integer"
    (inherited from: values="vectorORfactor", lengths="integer")
values="character", lengths="missing"
    (inherited from: values="vectorORfactor", lengths="missing")
values="factor", lengths="integer"
    (inherited from: values="vectorORfactor", lengths="integer")
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

## We have proper inheritance for Rle
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GenomicRanges_1.10.6 IRanges_1.16.4       BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2

## But this version does not work as intended
## Firstly, start a fresh R session
> library(GenomicRanges)
[snip: same startup messages as above]
> setClassUnion("vectorOrNULL", c("vector", "NULL")) ## This line is the culprit
> out <- list(chr = rep('chr21', 10), 1:10, start = 1:10, end = 2:11)
> showMethods('Rle')
Function: Rle (package IRanges)
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

> gr <- GRanges(seqnames = out[['chr']], ranges = IRanges(start = out[['start']], end = out[['end']]))
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘Rle’ for signature ‘"character", "missing"’
> showMethods('Rle')
Function: Rle (package IRanges)
values="missing", lengths="missing"
values="vectorORfactor", lengths="integer"
values="vectorORfactor", lengths="missing"
values="vectorORfactor", lengths="numeric"

## Inheritance problems for Rle
> sessionInfo()
[snip: identical to the sessionInfo() output above]

Cheers,
Pete
Hi Pete,

On Wed, Jan 30, 2013 at 7:17 PM, <hickey at wehi.edu.au> wrote:
> Hi Steve,
>
> Thanks for your explanation. I'm just learning about the S4 class and
> methods so I suspected I'd missed something. I ran your example on my
> machine and it returned the same output.
>
> I've now found the real problem in my code but don't understand why it is
> causing inheritance problems for Rle. Basically, there's a line in my class
> definitions to define a class union, namely: setClassUnion('vectorOrNULL',
> c("vector", "NULL")). Whether that line is run before I try to construct
> the GRanges object determines whether the object is successfully created.
> Can anyone please explain this to me?

[snip]

> ## But this version does not work as intended
> ## Firstly, start a fresh R session
>> library(GenomicRanges)

[snip]

>> setClassUnion("vectorOrNULL", c("vector", "NULL")) ## This line is the culprit
>> out <- list(chr = rep('chr21', 10), 1:10, start = 1:10, end = 2:11)
>> showMethods('Rle')
> Function: Rle (package IRanges)
> values="missing", lengths="missing"
> values="vectorORfactor", lengths="integer"
> values="vectorORfactor", lengths="missing"
> values="vectorORfactor", lengths="numeric"
>
>> gr <- GRanges(seqnames = out[['chr']], ranges = IRanges(start = out[['start']], end = out[['end']]))
> Error in function (classes, fdef, mtable) :
>   unable to find an inherited method for function ‘Rle’ for signature ‘"character", "missing"’
>> showMethods('Rle')
> Function: Rle (package IRanges)
> values="missing", lengths="missing"
> values="vectorORfactor", lengths="integer"
> values="vectorORfactor", lengths="missing"
> values="vectorORfactor", lengths="numeric"
>
> ## Inheritance problems for Rle

Interesting ... my guess is that it's because, with your new class union, both of these are now TRUE:

R> is(c('a', 'b', 'c'), 'vectorORfactor')
[1] TRUE

R> is(c('a', 'b', 'c'), 'vectorOrNULL')
[1] TRUE

But it really feels like the class union shouldn't be getting in the way -- I mean, if one then writes an Rle method for c("vectorOrNULL", "missing"), I can imagine what the problem might be, but that's not the case here.

Hmmm ... if I were a bit bolder, I'd hazard that this might even be a bug somewhere in some S4 dispatching mojo, but I'm not well-versed-enough in its voodoo to make that claim.

I suspect Martin will likely chime in to point out what is the what, here ;-)

-steve
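
For anyone wanting to run the same membership checks without IRanges loaded, here is a small sketch using only the base `methods` package. The union names `demoVectorORfactor` and `demoVectorOrNULL` are hypothetical, chosen so they do not collide with the real IRanges class; the sketch shows the healthy case in which a character vector belongs to both unions.

```r
library(methods)

# Illustrative unions mirroring the two in the thread (names are hypothetical).
setClassUnion("demoVectorORfactor", c("vector", "factor"))
setClassUnion("demoVectorOrNULL", c("vector", "NULL"))

x <- c("a", "b", "c")

# Object-level checks, analogous to the is() calls in the message above.
in_first  <- is(x, "demoVectorORfactor")
in_second <- is(x, "demoVectorOrNULL")

# Class-level view: every superclass currently recorded for "character".
supers <- extends("character")
print(supers)
```

`is()` answers the question for one object, while `extends()` with a single argument lists what the class definition itself currently records, which is the information method dispatch relies on.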
On 01/30/2013 06:55 PM, Steve Lianoglou wrote:
> [snip]
>
> Hmmm ... if I were a bit bolder, I'd hazard that this might even be a
> bug somewhere in some S4 dispatching mojo, but I'm not
> well-versed-enough in its voodoo to make that claim.
>
> I suspect Martin will likely chime in to point out what is the what, here ;-)

Yep, this is a puzzler. Here's what happens in a fresh R session:

> setClassUnion("vectorORfactor", c("vector", "factor"))
> getClass("numeric")
Class "numeric" [package "methods"]

No Slots, prototype of class "numeric"

Extends:
Class "vector", directly
Class "vectorORfactor", by class "vector", distance 2

Known Subclasses:
Class "integer", directly
Class "ordered", by class "factor", distance 3

and then

> setClassUnion("vectorOrNULL", c("vector", "NULL"))
> getClass("numeric")
Class "numeric" [package "methods"]

No Slots, prototype of class "numeric"

Extends:
Class "vector", directly
Class "vectorORfactor", by class "vector", distance 2
Class "vectorOrNULL", by class "vector", distance 2

Known Subclasses:
Class "integer", directly
Class "ordered", by class "factor", distance 3

Notice that "numeric" extends our two class unions.

Now when we're dealing with a package, focusing on the 'Extends:' component

> library(IRanges)
> getClass("numeric")
...
Extends:
Class "vector", directly
Class "atomic", directly
Class "vectorORfactor", by class "vector", distance 2
> setClassUnion("vectorOrNULL", c("vector", "NULL"))
> getClass("numeric")
...
Extends:
Class "vector", directly
Class "vectorOrNULL", by class "vector", distance 2

so we have replaced rather than amended Extends:. I think the error with method dispatch follows from this -- we end up looking for a method defined on vectorOrNULL, and don't find one.

I think the problem is in methods::assignClassDef, but things get a bit hairy for me; maybe there are class definitions for numeric that are found in IRanges, and in methods, and the latter over-writes the former?

A work-around seems to be to setClassUnion() before loading IRanges.

I find class unions pretty weird -- reaching into the class hierarchy and saying no, inheritance works _this_ way, and at the same time making things complicated for ourselves because we always have to check whether the slot is a vector or NULL -- I wonder what you're hoping to accomplish with this? I know the pattern is well-established in IRanges...

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
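
The before/after comparison of 'Extends:' in this reply can be turned into a small check. This is a hedged sketch, not a fix: `superclasses_of()` is a hypothetical helper, the union names are invented for illustration, and the code runs against the base `methods` package only, where (as in the fresh-session transcript above) nothing should be lost.

```r
library(methods)

# Hypothetical helper: names of all superclasses recorded for a class.
superclasses_of <- function(cl) names(getClass(cl)@contains)

# Illustrative unions; the names are made up to avoid clashing with IRanges.
setClassUnion("checkVectorORfactor", c("vector", "factor"))
before <- superclasses_of("numeric")

setClassUnion("checkVectorOrNULL", c("vector", "NULL"))
after <- superclasses_of("numeric")

# Anything in 'dropped' was replaced rather than amended by the second union.
dropped <- setdiff(before, after)
added   <- setdiff(after, before)

print(list(dropped = dropped, added = added))
```

Running the same comparison in a session with IRanges already loaded would be one way to confirm whether a given R and IRanges version still shows the replacement described above; a non-empty `dropped` would correspond to the lost "vectorORfactor" entry.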
On 31/01/2013, at 5:45 PM, Martin Morgan wrote: >> [snip] > > Yep, this is a puzzler. Here's what happens in a fresh R session: > > > setClassUnion("vectorORfactor", c("vector", "factor")) > > getClass("numeric") > Class "numeric" [package "methods"] > > No Slots, prototype of class "numeric" > > Extends: > Class "vector", directly > Class "vectorORfactor", by class "vector", distance 2 > > Known Subclasses: > Class "integer", directly > Class "ordered", by class "factor", distance 3 > > and then > > > setClassUnion("vectorOrNULL", c("vector", "NULL")) > > getClass("numeric") > Class "numeric" [package "methods"] > > No Slots, prototype of class "numeric" > > Extends: > Class "vector", directly > Class "vectorORfactor", by class "vector", distance 2 > Class "vectorOrNULL", by class "vector", distance 2 > > Known Subclasses: > Class "integer", directly > Class "ordered", by class "factor", distance 3 > > Notice that "numeric" extends our two class unions. > > Now when we're dealing with a package, focusing on the 'Extends:' component > > library(IRanges) > > getClass("numeric") > ... > Extends: > Class "vector", directly > Class "atomic", directly > Class "vectorORfactor", by class "vector", distance 2 > > setClassUnion("vectorOrNULL", c("vector", "NULL")) > > getClass("numeric") > ... > Extends: > Class "vector", directly > Class "vectorOrNULL", by class "vector", distance 2 > > so we have replaced rather than amended Extends:. I think the error with method dispatch follows from this -- we end up looking for a method defined on vectorORNULL, and don't find one. > > I think the problem is in methods::assignClassDef, but things get a bit hairy for me; maybe there are class definitions for numeric that are found in IRanges, and in methods, and the latter over-writes the former? > > A work-around seems to be to setClassUnion() before loading IRanges. 
> > I find class unions pretty weird -- reach in to the class hierarchy and saying no, inheritance works _this_ way and at the same time making things complicated for ourselves because we always have to check whether the slot is a vector or NULL -- I wonder what you're hoping to accomplish with this? I know the pattern is well-established in IRanges... > > Martin > >> >> -steve >> > > > -- > Computational Biology / Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 Hi Martin and Steve, Thanks for taking the time to explain this to me. As I said, I'm just learning S4 classes and I think this is a case of me trying to move too fast, too soon :) Here's what I'm trying to accomplish with a setClassUnion(). I've based my class on the 'BSseq' class in the 'bsseq' package; partly for didactic purposes (it always helps me to have something to work-off when getting started with a new programming concept) and partly because I'm also dealing with BS-seq data. My reason for using a 'setClassUnion("vectorOrNULL'", c("vector", "NULL"))' is that for some of the slots the values aren't defined (i.e. they are NULL) when my object is first created; rather they are defined by a subsequent function call. In pseudo-code it goes something like this: > library(GenomicRanges) > setClassUnion("vectorOrNULL'", c("vector", "NULL")) > setClass('MyClass", representation(gr = "GRanges", model_parameters = "vectorOrNULL")) > in <- read_input_file(some_input_file) > in_gr <- input_file_to_GRanges(in) # Convert 'in' to a GRanges object > in_my_class <- MyClass(gr = in_gr, model_parameters = NULL) ## At this point, is.null(in_my_class@model_parameters) == TRUE ## Now, some modelling is done based on 'in_my_class'. in_my_class_modelled <- modeling_function(in_my_class) ## This function returns a MyClass object, with the 'model_parameters' slot now filled by a vector of model parameters. 
## At this point, is.null(in_my_class@model_parameters) == TRUE, whereas
## is.null(in_my_class_modelled@model_parameters) == FALSE

Basically, I'd really like to have the model_parameters in the same MyClass object as the raw data (along with some other things, like sampleName etc.) but these aren't available at the point when the MyClass object is first constructed. The 'modelling_function()' returns a new MyClass object, with the 'gr' slot copied from the input 'in_my_class' and the 'model_parameters' slot filled by a vector rather than a NULL.

Incidentally, I believe this is similar to the logic employed by Kasper with the BSseq class in the 'bsseq' package, which stores the raw data in a GRanges slot alongside a 'setClassUnion("matrixOrNULL", c("matrix", "NULL"))' slot for the model parameters. That slot is NULL when the BSseq object is created and is a matrix in the new BSseq object returned by a modelling function ('BSmooth') applied to the original BSseq object.

I realise this might not be the proper way of doing things; for instance, there is a lot of seemingly redundant copying going on. I'm definitely open to suggestions of better ways of implementing this.

Cheers,
Pete

--------------------------------
Peter Hickey,
PhD Student/Research Assistant,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Ph: +613 9345 2324

hickey@wehi.edu.au
http://www.wehi.edu.au
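For readers who want to try the NULL-placeholder pattern described above, here is a minimal sketch using only the 'methods' package. The names MyClassSketch and fit_sketch() are hypothetical, and a plain data.frame stands in for the GRanges slot, so this does not load IRanges and therefore does not exercise the class-union conflict Martin describes.

```r
library(methods)

# Hypothetical stand-ins: "MyClassSketch" mirrors MyClass from the post, and a
# plain data.frame replaces the GRanges slot so no Bioconductor package is needed.
setClassUnion("vectorOrNULL", c("vector", "NULL"))
setClass("MyClassSketch",
         representation(raw = "data.frame", model_parameters = "vectorOrNULL"))

# Construct the object before the model parameters are known.
obj <- new("MyClassSketch",
           raw = data.frame(start = 1:3, end = 4:6),
           model_parameters = NULL)

# Hypothetical modelling step: returns a copy with the slot filled in.
fit_sketch <- function(x) {
  x@model_parameters <- c(mean_width = mean(x@raw$end - x@raw$start))
  x
}
fitted <- fit_sketch(obj)

is.null(obj@model_parameters)     # the placeholder is still NULL here
is.null(fitted@model_parameters)  # the fitted copy carries a numeric vector
```

Every piece of code that reads the slot has to branch on is.null(), which is the extra bookkeeping Martin mentions.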
Hi Pete,

On Thu, Jan 31, 2013 at 3:22 AM, <hickey@wehi.edu.au> wrote:
> [snip]
> My reason for using a 'setClassUnion("vectorOrNULL", c("vector", "NULL"))'
> is that for some of the slots the values aren't defined (i.e. they are
> NULL) when my object is first created; rather they are defined by a
> subsequent function call.
> [snip]
I see what you're trying to do, and I believe I've done this once or twice before myself (that is, used NULL as a placeholder until a later call initializes the slot) ... luckily I somehow never got kicked by the bug/behavior you found.

Perhaps in this case you can just set your `model_parameters` slot to `numeric` and, during the initialization/construction of your object, assign NA_real_ as its slot value, or more simply `numeric()` (a zero-length numeric vector).

In this way you still have a sentinel value that indicates the slot is yet to be initialized with a proper value, but you are still using an object of class `numeric` to do so; no class unions necessary.

Would that work for you?

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
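A rough sketch of the suggestion above, again using only the 'methods' package. MyClassNumeric and has_parameters() are hypothetical names, and a data.frame stands in for the GRanges slot; the zero-length numeric vector plays the role of "not yet fitted".

```r
library(methods)

# Hypothetical names; "data.frame" stands in for the GRanges slot.
setClass("MyClassNumeric",
         representation(raw = "data.frame", model_parameters = "numeric"),
         prototype(model_parameters = numeric()))

# Hypothetical helper: a zero-length slot means "not fitted yet".
has_parameters <- function(x) length(x@model_parameters) > 0L

obj <- new("MyClassNumeric", raw = data.frame(start = 1:3, end = 4:6))

# Stand-in for the modelling step: copy the object and fill the slot.
fitted <- obj
fitted@model_parameters <- c(a = 0.5, b = 2)

has_parameters(obj)
has_parameters(fitted)
```

Because the slot is always `numeric`, no class union is defined and the class hierarchy of the basic types is left alone.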
Hi Steve,

On 31/01/2013, at 11:40 PM, Steve Lianoglou wrote:
> [snip]
> Perhaps in this case, you can just set your `model_parameters` slot to
> `numeric`, and during the initialization/construction of your object,
> assign NA_real_ as its slot value, or more simply `numeric()` (a
> zero-length numeric vector).
> [snip]

I think that's exactly the road I'll go down for the time being. The less I'm messing about with things I don't well understand, the better :)

Cheers,
Pete
Peter Hickey ▴ 740
@petehaitch
Last seen 7 days ago
WEHI, Melbourne, Australia
Bugger, I of course left off my session info. Here it is:

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GenomicRanges_1.10.6 IRanges_1.16.4       BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2

On 30/01/2013, at 8:57 PM, hickey@wehi.edu.au wrote:

> There appear to be different methods available for 'Rle' when loaded via
> the GenomicRanges package depending on whether a GRanges object has been
> created. Specifically, prior to a GRanges object being created there are
> no 'values = character' methods for 'Rle'. This doesn't make sense to me
> and is causing me problems in code I am developing.
>
> The following code highlights the cause of my confusion:
>
> | > library(GenomicRanges)
> | Loading required package: BiocGenerics
> |
> | Attaching package: ‘BiocGenerics’
> |
> | The following object(s) are masked from ‘package:stats’:
> |
> |     xtabs
> |
> | The following object(s) are masked from ‘package:base’:
> |
> |     anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
> |     get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
> |     pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
> |     rownames, sapply, setdiff, table, tapply, union, unique
>
> | Loading required package: IRanges
> | > showMethods('Rle')
> | Function: Rle (package IRanges)
> | values="missing", lengths="missing"
> | values="vectorORfactor", lengths="integer"
> | values="vectorORfactor", lengths="missing"
> | values="vectorORfactor", lengths="numeric"
>
> ## Only 4 methods are available for Rle
>
> | > seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
> | > gr <- GRanges(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
> | +               ranges = IRanges(1:10, width = 10:1, names = head(letters,10)),
> | +               strand = Rle(strand(c("-", "+", "*", "+", "-")),c(1, 2, 2, 3, 2)),
> | +               score = 1:10, GC = seq(1, 0, length=10),
> | +               seqinfo = seqinfo)
> | > gr
> | GRanges with 10 ranges and 2 metadata columns:
> |     seqnames    ranges strand |     score                GC
> |        <rle> <iranges>  <rle> | <integer>         <numeric>
> |   a     chr1  [ 1, 10]      - |         1                 1
> |   b     chr2  [ 2, 10]      + |         2 0.888888888888889
> |   c     chr2  [ 3, 10]      + |         3 0.777777777777778
> |   d     chr2  [ 4, 10]      * |         4 0.666666666666667
> |   e     chr1  [ 5, 10]      * |         5 0.555555555555556
> |   f     chr1  [ 6, 10]      + |         6 0.444444444444444
> |   g     chr3  [ 7, 10]      + |         7 0.333333333333333
> |   h     chr3  [ 8, 10]      + |         8 0.222222222222222
> |   i     chr3  [ 9, 10]      - |         9 0.111111111111111
> |   j     chr3  [10, 10]      - |        10                 0
> |   ---
> |   seqlengths:
> |    chr1 chr2 chr3
> |    1000 2000 1500
> | > showMethods('Rle')
> | Function: Rle (package IRanges)
> | values="character", lengths="integer"
> |     (inherited from: values="vectorORfactor", lengths="integer")
> | values="character", lengths="numeric"
> |     (inherited from: values="vectorORfactor", lengths="numeric")
> | values="factor", lengths="integer"
> |     (inherited from: values="vectorORfactor", lengths="integer")
> | values="factor", lengths="numeric"
> |     (inherited from: values="vectorORfactor", lengths="numeric")
> | values="missing", lengths="missing"
> | values="vectorORfactor", lengths="integer"
> | values="vectorORfactor", lengths="missing"
> | values="vectorORfactor", lengths="numeric"
>
> ## Now, there are 8 methods available for Rle
>
> Is this a bug or am I missing something? If I'm just missing something,
> can someone please explain how I can ensure that the methods involving
> 'values = character' are available to me upon loading of the GenomicRanges
> package?
>
> Many thanks,
> Pete
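One way to read the two showMethods('Rle') listings: the four extra entries are all marked "(inherited from: ...)", and showMethods() only lists an inherited method once it has been used in the session. The sketch below tries to reproduce that behaviour with the 'methods' package alone. The generic demoRle and the union demoVectorORfactor are hypothetical stand-ins, not the IRanges definitions, and this illustrates only the listing difference, not the class-union conflict discussed elsewhere in the thread.

```r
library(methods)

# Hypothetical generic and class union, named so they cannot clash with IRanges.
setClassUnion("demoVectorORfactor", c("vector", "factor"))
setGeneric("demoRle", function(values) standardGeneric("demoRle"))
setMethod("demoRle", "demoVectorORfactor",
          function(values) length(values))

# Listing before any call: only the directly defined method is expected.
before <- capture.output(showMethods("demoRle", inherited = TRUE))

# First dispatch on a character vector; R resolves the method by inheritance
# and remembers the result for later calls.
n <- demoRle(c("chr1", "chr2"))

# Listing after the call: the cached inherited method should now be shown.
after <- capture.output(showMethods("demoRle", inherited = TRUE))

any(grepl("inherited from", before))
any(grepl("inherited from", after))
```

If this reading is right, the character method was usable before the GRanges object was created; selectMethod("demoRle", "character") is one way to check what would be dispatched without relying on the listing.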