Subsetting a GRanges with == works:
granges <- GenomicRanges::GRanges(
c('chr1', 'chr2'),
'100-200',
seqinfo = GenomeInfoDb::seqinfo(BSgenome.Mmusculus.UCSC.mm10::Mmusculus))
subset(granges, seqnames == 'chr1')
But subsetting with %in% fails:
subset(granges, seqnames %in% c('chr1', 'chr3'))
Error in match(x, table, nomatch = 0L) : 'match' requires vector arguments
Thank you Michael :-). Sorry for forgetting require(magrittr). Question now updated without magrittr to keep it simple. What would be a clean-namespace alternative to require(GenomicRanges) for GRanges subsetting?

You could always do something like IRanges::`%in%`(x, y). Keeping the workspace clean can be a pain though, because you'll constantly have to remember the origin of each function.

I see, IRanges is the place where it lives - thanks!
(Michael, merging my answer - based on your comments - into yours and then deleting mine could keep things clean for future reference)
Nope, %in% doesn't live in IRanges. IRanges used to define S4 methods for %in% but not anymore. The fact that IRanges still exports %in% is a leftover from a long time ago. That needs to be removed. The S4Vectors package defines the %in% method for Vector derivatives. Note that %in% is an implicit generic. This means that we don't have a setGeneric() statement somewhere that promotes it to an S4 generic function. It just gets automatically promoted on the first setMethod() statement.

A cleaner situation would be to define the %in% generic with an explicit setGeneric() statement in BiocGenerics. We will do this soon. Then all you need to do is import the generic from BiocGenerics in your NAMESPACE. Then just use %in% normally. This will call the generic, which takes care of dispatching to the appropriate method. Always call the generic. Trying to call a particular method is not robust.

So no need to do things like S4Vectors::`%in%`(x, y) or BiocGenerics::`%in%`(x, y). If you fully import BiocGenerics and S4Vectors (with import(BiocGenerics) and import(S4Vectors)), which I strongly recommend, then you can start using %in% now without having to worry where the generic is defined.

H.
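
To make the implicit promotion described above concrete, here is a minimal sketch that uses only the methods package. The class Chromosomes and its slot ids are hypothetical stand-ins for a Vector derivative; they are not part of S4Vectors, and the real %in% method there is more general.

```r
library(methods)

# Hypothetical toy class standing in for a Vector derivative
# (an assumption for illustration, not an S4Vectors class).
setClass("Chromosomes", representation(ids = "character"))

# Record whether %in% is already an S4 generic before we touch it.
before <- isGeneric("%in%")

# No setGeneric() call anywhere: the first setMethod() implicitly promotes
# base::`%in%` to an S4 generic and keeps the base function as the default.
setMethod("%in%", signature(x = "Chromosomes", table = "character"),
          function(x, table) x@ids %in% table)

# Record the state again after the method definition.
after <- isGeneric("%in%")

chroms <- new("Chromosomes", ids = c("chr1", "chr2"))
hits <- chroms %in% c("chr1", "chr3")  # dispatches to the method above
```

Calling %in% directly, rather than a namespace-qualified copy of it, is what lets dispatch pick the right method here, which is the point made above about always calling the generic.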
Oh, that's illuminating - thx!
Is the following understanding correct?
BiocGenerics contains all generics required by BioC classes. When a need for a new generic arises, it is made explicit in this package and exported. The idiom is to @import BiocGenerics, and then one is free to play.
S4Vectors::Vector is the fundamental BioC datatype. Other S4 classes, e.g. GenomicRanges, SummarizedExperiment, etc. inherit from Vector and build further on that. Therefore, the idiom is to also @import S4Vectors, and then the BioC playing field lies wide open.

Not exactly, as there are generics specific to certain data types defined by the packages you mention and many others. Unless you are using a package for a very specific reason (like a single utility function), it's simplest to just import the package in bulk.
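
As a sketch of the bulk-import idiom, a package that uses roxygen2 could declare its imports in one place. The file name below is hypothetical, and GenomicRanges appears only as an example of a package that defines data-type-specific generics.

```r
# R/imports.R (hypothetical file name)
# roxygen2 turns these tags into import(...) directives in NAMESPACE.

#' @import BiocGenerics
#' @import S4Vectors
#' @import GenomicRanges
NULL
```

The generated NAMESPACE would then contain import(BiocGenerics), import(S4Vectors) and import(GenomicRanges). The same packages also need to be listed under Imports in DESCRIPTION.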
I see - it looks like I have to switch idioms then.
My default idiom has been to maintain a fully clean namespace, knowing exactly which function from which package is being used, being 100 % sure that no namespace clashes can occur.
But it looks like I need to drop that paradigm in the S4 world...
Would the following statement be right then?
On the one hand, BiocGenerics exports generics common to multiple BioC packages. On the other hand, GenomicRanges exports GRanges-specific generics, SummarizedExperiment exports SummarizedExperiment-specific generics, etc.

Yes, that's the idea. See ?BiocGenerics for the 2 kinds of generics we define in the BiocGenerics package. Also note that the proteomics folks define their own set of S4 generics in the ProtGenerics package.

Thanks Herve! So the apparent namespace clashes upon require(BiocGenerics) are actually intentional promotions of base primitives and S3 generics to S4 generics?

Exactly.
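
The promotion confirmed above can be illustrated without BiocGenerics. The sketch below explicitly promotes base::unique to an S4 generic, roughly the way a generics package might; the Tags class is hypothetical. The masking R reports when such a package is attached reflects this kind of promotion, and plain vectors still reach the base implementation through the default method.

```r
library(methods)

# Explicitly promote the base function to an S4 generic. The original
# base::unique becomes the default method, so existing calls keep working.
setGeneric("unique")

# Hypothetical class used only for illustration.
setClass("Tags", representation(ids = "character"))

# Method formals must match the generic: (x, incomparables = FALSE, ...)
setMethod("unique", "Tags",
          function(x, incomparables = FALSE, ...) {
              new("Tags", ids = unique(x@ids))
          })

plain <- unique(c(1, 1, 2))                           # default method: base::unique
tags  <- unique(new("Tags", ids = c("a", "a", "b")))  # Tags method
```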
FYI %in% is now an explicit S4 generic defined in the BiocGenerics package: https://github.com/Bioconductor/BiocGenerics/commit/30d0813751536f0dd03b36d1f90c116e889d1954

H.
Thanks Herve, also for providing the update :-)