Views on BSgenome object
2
1
Entering edit mode
@kasper-daniel-hansen-2979
Last seen 9 months ago
United States

If I have a GRanges, I was expecting to be able to instantiate a Views like

 ntv <- Views(Mmusculus, granges(BS))
 Error in (function (classes, fdef, mtable)  :
   unable to find an inherited method for function ‘Views’ for signature ‘"BSgenome"’

I assume this is because I need to instantiate a separate Views on each chromosome.  That is tedious.  Is there an easy fix?

Best,
Kasper

BSgenome • 2.0k views
ADD COMMENT
1
Entering edit mode
@herve-pages-1542
Last seen 4 hours ago
Seattle, WA, United States

Hi Kasper,

Yes instantiating a separate Views on each chromosome would be tedious. I just added the BSgenomeViews class to BSgenome 1.35.6 (this is in BioC devel) so now you can do:

library(BSgenome.Mmusculus.UCSC.mm10)
genome <- BSgenome.Mmusculus.UCSC.mm10

library(TxDb.Mmusculus.UCSC.mm10.knownGene)
txdb <- TxDb.Mmusculus.UCSC.mm10.knownGene
ex <- exons(txdb, columns=c("tx_name", "gene_id"))

v <- Views(genome, ex)

Then:

> v
BSgenomeViews object with 254827 views and 2 metadata columns:
           seqnames               ranges strand                       dna   |
              <Rle>            <IRanges>  <Rle>            <DNAStringSet>   |
       [1]     chr1   [4807893, 4807982]      + [GCACTGTCCG...CACCGCCGCG]   |
       [2]     chr1   [4808455, 4808486]      + [GTTATTTTCC...GAGATACAGG]   |
       [3]     chr1   [4828584, 4828649]      + [GCATGGATGG...GTCCACATGC]   |
       [4]     chr1   [4830268, 4830315]      + [CCCTGTGATG...TGCCTTCTTG]   |
       [5]     chr1   [4832311, 4832381]      + [GTTTGATATC...GCAGAAACCG]   |
       ...      ...                  ...    ...                       ... ...
  [254823]     chrY [90531602, 90531635]      - [GCGGTTGGGA...TAGATGAGCT]   |
  [254824]     chrY [90603501, 90603569]      - [GTTTGAAGTC...CTCATCATCG]   |
  [254825]     chrY [90604098, 90604282]      - [ACCCAGGAGA...ACTGCATGAG]   |
  [254826]     chrY [90605010, 90605191]      - [AGACATCCTG...TGATGAAGGA]   |
  [254827]     chrY [90605679, 90605864]      - [GGCCTTGCTC...TCTCCAGAAG]   |
                         tx_name         gene_id
                 <CharacterList> <CharacterList>
       [1] uc007afg.1,uc007afh.1           18777
       [2] uc007afg.1,uc007afh.1           18777
       [3] uc007afg.1,uc007afh.1           18777
       [4] uc007afg.1,uc007afh.1           18777
       [5] uc007afg.1,uc007afh.1           18777
       ...                   ...             ...
  [254823]            uc029ygu.1              NA
  [254824]            uc029ygv.1              NA
  [254825]            uc029ygv.1              NA
  [254826]            uc029ygv.1              NA
  [254827]            uc029ygv.1              NA
  -------
  seqinfo: 66 sequences (1 circular) from mm10 genome

This is just a BSgenome object and a GRanges object bundle together. You can get them back with:

subject(v)
granges(v)

You can use [[ to extract a given sequence:

> v[[2]]
  32-letter "DNAString" instance
seq: GTTATTTTCCTTCACGGATTGGGAGATACAGG

or coerce to DNAStringSet to extract them all:

as(v, "DNAStringSet")  # might take a while

Some of the most frequently used DNAStringSet methods work on v:

> alphabetFrequency(v, collapse=TRUE)
       A        C        G        T        M        R        W        S 
21951343 20431781 20686000 21448065        0        0        0        0 
       Y        K        V        H        D        B        N        - 
       0        0        0        0        0        0        2        0 
       +        . 
       0        0 

More could be added...

Will add a man page for this new class. Let me know on the bioc-devel mailing list how it goes for you.

H.

ADD COMMENT
0
Entering edit mode
@kasper-daniel-hansen-2979
Last seen 9 months ago
United States

Herve,

Thanks a lot (that was quick!), I will test this in the next couple of days when it has propagated to the build system.

Best,
Kasper

ADD COMMENT

Login before adding your answer.

Traffic: 853 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6