Question: error when coercing GenomeData to RangedData: 'no method or default for coercing “XStringViews” to “RangedData”'
Ludo Pagie wrote, 3.5 years ago:
Hi all,

I'm getting an error when trying to coerce a GenomeData object to a RangedData object. Here's the code I have used together with some output (in particular the error msg when coercing):

######################################################################
# import human genome sequence
library(BSgenome.Hsapiens.UCSC.hg19)

# Virtual digest of the entire genome in GATC fragments:
# matchPattern to find GATC motifs, using bsapply
ExtractGATCFragments <- function(chr) {
  # function for finding GATC sites in a chromosome and returning the fragments
  # starting and ending with GATC
  GATC.match <- matchPattern(chr, pattern='GATC')
  start <- c(1, start(GATC.match)) # not sure what happens if chromosome starts with GATC
  end <- c(end(GATC.match), length(chr))
  Views(subject=unmasked(chr),start=start, end=end)
}
pm <- new('BSParams', X=Hsapiens, FUN=function(chr) ExtractGATCFragments(chr) )
# create the GenomeData object:
GATC.fragments <- bsapply(pm)

# is it a genomeData??
class(GATC.fragments)
# [1] "GenomeData"
# attr(,"package")
# [1] "BSgenome"

# coerce it to a RangedData:
as(GATC.fragments, "RangedData")
# Error in FUN(X[[1L]], ...) :
#   no method or default for coercing “XStringViews” to “RangedData”

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nl_NL.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.3.19 BSgenome_1.30.0
[3] Biostrings_2.30.1                  GenomicRanges_1.14.4
[5] XVector_0.2.0                      IRanges_1.20.6
[7] BiocGenerics_0.8.0

loaded via a namespace (and not attached):
[1] stats4_3.0.2 tools_3.0.2
######################################################################

Can anybody point out to me why the object cannot be coerced?

Thanks, Ludo
Michael Lawrence (United States) wrote, 3.5 years ago:
First off, I think you can simplify your function to:

  ExtractGATCFragments <- function(chr) {
    gaps(matchPattern(chr, pattern='GATC'))
  }

But to get to the result you might want, instead of the bsapply and RangedData stuff, just do:

  gr <- vmatchPDict("GATC", Hsapiens)

The 'gr' represents the ranges of GATC in the genome. It matches on both strands, which is not useful for you since GATC is palindromic, so it might be slower than you want. This also assumes you're not interested in keeping the genomic sequence around (it's big, and why?). But anyway, just do this to get your result:

  ans <- subset(gaps(gr), strand=="+") # may require updating your R/Bioc

But if you do want to use bsapply, you should at least coerce to GRanges, not RangedData, which is not suitable for this type of data. Unfortunately, GenomeData is a dinosaur and there is no direct coercion to GRanges, but RangesList is a link to the past:

  gr <- as(as(GATC.fragments, "RangesList"), "GRanges")

Really what we need for this use case is an lapply method for BSgenome, then we would just coerce to List, which would result in a RangesList, assuming the user function returned Ranges, and then coerce to GRanges. And/or clean up vmatchPDict,BSgenome so that it has an option to match only to the positive strand.

Michael

On Wed, May 7, 2014 at 8:05 AM, Ludo Pagie wrote:
> Hi all,
>
> I'm getting an error when trying to coerce a GenomeData object to a
> RangedData object. [...]
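A note on the two fragment definitions in this thread: the function in the question builds fragments that run from one GATC site through the next (so each fragment includes its flanking sites), whereas gaps() returns the stretches between sites with GATC itself excluded. The sketch below illustrates the difference on a small toy sequence. The sequence and the variable names (sites, frag_incl, frag_excl) are made up for illustration and are not from the thread; it assumes the IRanges and Biostrings packages are installed and works on plain IRanges rather than on GenomeData or Views objects.

```r
library(IRanges)
library(Biostrings)

# Toy "chromosome" (hypothetical, for illustration only).
# GATC occurs at positions 3-6 and 11-14; the sequence is 16 bp long.
chr <- DNAString("AAGATCTTTTGATCCC")

# Locate the GATC sites and keep them as plain integer ranges
m <- matchPattern("GATC", chr)
sites <- IRanges(start = start(m), end = end(m))

# Fragments as defined in the question: from one site through the next,
# so neighbouring fragments overlap by the 4-bp GATC site
frag_incl <- IRanges(start = c(1L, start(sites)),
                     end   = c(end(sites), length(chr)))

# Fragments as returned by gaps(): the sequence between sites, GATC excluded
frag_excl <- gaps(sites, start = 1L, end = length(chr))

print(frag_incl)  # ranges 1-6, 3-14, 11-16
print(frag_excl)  # ranges 1-2, 7-10, 15-16
```

With the question's definition, a chromosome that starts with GATC would get a first fragment covering positions 1 to 4, that is, just the site itself. Which definition is appropriate depends on whether the downstream analysis expects the restriction sites to be part of the fragments.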