error when coercing GenomeData to RangedData: 'no method or default for coercing “XStringViews” to “RangedData”'
Ludo Pagie ▴ 40
@ludo-pagie-6130
Last seen 8.9 years ago
Hi all,

I'm getting an error when trying to coerce a GenomeData object to a RangedData object. Here's the code I have used together with some output (in particular the error msg when coercing):

######################################################################
# import human genome sequence
library(BSgenome.Hsapiens.UCSC.hg19)

# Virtual digest of the entire genome in GATC fragments:
# matchPattern to find GATC motifs, using bsapply
ExtractGATCFragments <- function(chr) {
  # function for finding GATC sites in a chromosome and returning the fragments
  # starting and ending with GATC
  GATC.match <- matchPattern(chr, pattern='GATC')
  start <- c(1, start(GATC.match)) # not sure what happens if chromosome starts with GATC
  end <- c(end(GATC.match), length(chr))
  Views(subject=unmasked(chr),start=start, end=end)
}
pm <- new('BSParams', X=Hsapiens, FUN=function(chr) ExtractGATCFragments(chr) )
# create the GenomeData object:
GATC.fragments <- bsapply(pm)

# is it a genomeData??
class(GATC.fragments)
# [1] "GenomeData"
# attr(,"package")
# [1] "BSgenome"

# coerce it to a RangedData:
as(GATC.fragments, "RangedData")
# Error in FUN(X[[1L]], ...) :
#   no method or default for coercing “XStringViews” to “RangedData”

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nl_NL.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.3.19 BSgenome_1.30.0
[3] Biostrings_2.30.1                  GenomicRanges_1.14.4
[5] XVector_0.2.0                      IRanges_1.20.6
[7] BiocGenerics_0.8.0

loaded via a namespace (and not attached):
[1] stats4_3.0.2 tools_3.0.2
######################################################################

Can anybody point out to me why the object cannot be coerced?

Thanks,
Ludo
BSgenome • 1.5k views
@michael-lawrence-3846
Last seen 3.0 years ago
United States
First off, I think you can simplify your function to:

ExtractGATCFragments <- function(chr) {
  gaps(matchPattern(chr, pattern='GATC'))
}

But to get to the result you might want, instead of the bsapply and RangedData stuff, just do:

gr <- vmatchPDict("GATC", Hsapiens)

The 'gr' represents the ranges of GATC in the genome. It matches on both strands, which is not useful for you since GATC is palindromic, so it might be slower than you want. This also assumes you're not interested in keeping the genomic sequence around (it's big, and why would you?). But anyway, just do this to get your result:

ans <- subset(gaps(gr), strand=="+") # may require updating your R/Bioc

But if you do want to use bsapply, you should at least coerce to GRanges, not RangedData, which is not suitable for this type of data. Unfortunately, GenomeData is a dinosaur and there is no direct coercion to GRanges, but RangesList is a link to the past:

gr <- as(as(GATC.fragments, "RangesList"), "GRanges")

Really what we need for this use case is an lapply method for BSgenome. Then we would just coerce to List, which would result in a RangesList (assuming the user function returned Ranges), and then coerce to GRanges. And/or clean up vmatchPDict,BSgenome so that it has an option to match only on the positive strand.

Michael

On Wed, May 7, 2014 at 8:05 AM, Ludo Pagie <ludo.pagie@gmail.com> wrote:
> Hi all,
>
> I'm getting an error when trying to coerce a GenomeData object to a
> RangedData object.
> [...]
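To make the difference between the two fragment definitions in this thread concrete, here is a minimal sketch on a toy sequence. It is not code from either poster: the 22-bp sequence, the seqname "chrToy", and the variable names are made up for illustration, and it assumes Biostrings and GenomicRanges are installed. It compares the fragments built in the question (each one starts and ends with a GATC site) with the regions that gaps() leaves between the sites, and then wraps plain ranges in a GRanges, which works because no sequence views are attached.

```r
# Toy illustration only: a short made-up sequence stands in for a chromosome.
library(Biostrings)      # DNAString, matchPattern
library(GenomicRanges)   # GRanges (also attaches IRanges)

chr   <- DNAString("AAGATCTTTTGATCCCGATCAA")  # hypothetical 22-bp "chromosome"
sites <- matchPattern("GATC", chr)            # GATC sites at 3-6, 11-14, 17-20

# Fragments as built in the question: each runs from one GATC site
# through the next one, so neighbouring fragments share a 4-bp site.
frags <- IRanges(start = c(1L, start(sites)),
                 end   = c(end(sites), length(chr)))

# Regions between the sites: what gaps() returns once the sites
# themselves are removed from the interval 1..length(chr).
site_ranges <- IRanges(start(sites), end(sites))
between     <- gaps(site_ranges, start = 1L, end = length(chr))

# Plain ranges (no sequence attached) go straight into a GRanges.
gr <- GRanges(seqnames = "chrToy", ranges = frags, strand = "+")

print(frags)
print(between)
print(gr)
```

On this toy input the two definitions give different ranges: the question's fragments include the flanking GATC sites, while the gaps exclude them. If the fragments must start and end with GATC, the gaps-based shortcut would need its ranges extended by the site width on each side.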