Is it possible to use Gviz without a valid UCSC genome?
@vinicius-henrique-da-silva-6713
Last seen 10 months ago
Brazil

I would like to use the Gviz package, but I have my reference genome in FASTA format rather than as a valid UCSC genome (it has not been published yet).

Is it possible to use this FASTA file instead?

Updated for @Florian:
Let's say that I have a genome in FASTA format on disk:

library(Biostrings)
ncrna <- readDNAStringSet(file = "GTgenome.fa")

head(ncrna)
  A DNAStringSet instance of length 6
       width seq                                            names
[1] 20202851 CCCAGTTTTCCCCACTCTGTGA...AAAGATCTTACAACCGATTTT chr10 Major...
[2] 20315886 AGCCGACGAGACTCACAGAACC...TCACAAACCCCCTCGGGAGGG chr11 Major...
[3] 20466350 GATTAGACCTCCGAAAGGGGTA...ATTATTAATTATTAAATATTA chr12 Major...
[4] 16480340 GTCTCCACTTGCCCCACAACGG...AGATGACGATGATGAAGATGA chr13 Major...
[5] 16193477 CTCTGTGACATCACAGCCATGG...GGGTTACACACGTTGTTTTTT chr14 Major...

Using the object created to replace a UCSC genome:

ideoTrack <- IdeogramTrack(genome=ncrna, chromosome="chr10", fontsize=14)

Error in .Call2("new_XStringSet_from_CHARACTER", ans_class, ans_elementType,  :
  key 51 (char '3') not in lookup table
In addition: Warning message:
In if (!token %in% base::ls(env)) { :
  the condition has length > 1 and only the first element will be used

I am probably misunderstanding a very basic feature here, but I would be grateful for some help!

 

gviz UCSC • 2.6k views
@florianhahnenovartiscom-3784
Last seen 5.6 years ago
Switzerland

Gviz per se does not care about the exact nature of your reference genome. A couple of features are only available out of the box for UCSC genomes, like bands in IdeogramTracks, although there are ways to make those work for custom genomes, too. If you want to include the reference sequence in a SequenceTrack, you can do that by first reading your FASTA file in as a DNAStringSet using the readDNAStringSet() function in Biostrings.
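
A minimal sketch of the SequenceTrack route, assuming the FASTA headers carry extra text after the chromosome name (as the "chr10 Major..." names in the question suggest), so they need trimming before a chromosome can be selected by name. The helper clean_seqnames, the toy sequences and the genome label "custom" are illustrative assumptions, not part of Gviz or Biostrings.

```r
# Hypothetical helper: keep only the first whitespace-delimited token of a
# FASTA header, e.g. "chr10 Major..." becomes "chr10".
clean_seqnames <- function(x) sub("\\s.*$", "", x)

if (requireNamespace("Biostrings", quietly = TRUE) &&
    requireNamespace("Gviz", quietly = TRUE)) {
  # Toy stand-in for the real genome; with a file on disk this would be
  # seqs <- Biostrings::readDNAStringSet("GTgenome.fa")
  seqs <- Biostrings::DNAStringSet(c("chr10 toy" = "ACGTACGTAC",
                                     "chr11 toy" = "TTGACCGTAA"))

  # Trim the names so they are plain chromosome identifiers.
  names(seqs) <- clean_seqnames(names(seqs))

  # "custom" is an arbitrary label (assumption), not a UCSC genome name.
  sTrack <- Gviz::SequenceTrack(seqs, chromosome = "chr10", genome = "custom")
}
```

Trimming the names first means the chromosome argument can be given as a short identifier such as "chr10" rather than the full FASTA header.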

If this is not what you are after, you will have to give a more detailed explanation of what you are trying to do here, ideally with some reproducible code.

Florian


Florian, please check my update!

@florianhahnenovartiscom-3784
Last seen 5.6 years ago
Switzerland

The DNAStringSet remark was about setting up SequenceTrack objects.

The user-provided data for the IdeogramTrack needs to be in the form of a data.frame containing the cytoband information. See the bands argument in the IdeogramTrack documentation:

  bands: A ‘data.frame’ with the cytoband information for all
          available chromosomes on the genome similar to the data that
          would be fetched from UCSC. The table needs to contain the
          mandatory columns ‘chrom’, ‘chromStart’, ‘chromEnd’, ‘name’
          and ‘gieStain’ with the chromosome name, cytoband start and
          end coordinates, cytoband name and coloring information,
          respectively. This can be used when no connection to the
          internet is available or when the cytoband information has
          been cached locally to avoid the somewhat slow connection to
          UCSC.
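
As a rough sketch of what such a table could look like for the genome in the question: one placeholder band per chromosome, with lengths copied from the width column of the head(ncrna) output. The genome label "custom" and the "gneg" stain are assumptions for illustration, and I have not checked how a single-band ideogram renders; real cytoband data would have several rows per chromosome.

```r
# Chromosome lengths copied from the DNAStringSet output in the question.
chrom_lengths <- c(chr10 = 20202851, chr11 = 20315886, chr12 = 20466350)

# One placeholder band spanning each chromosome, using the mandatory column
# names listed in the 'bands' documentation.
bands <- data.frame(
  chrom      = names(chrom_lengths),
  chromStart = 0,
  chromEnd   = unname(chrom_lengths),
  name       = names(chrom_lengths),
  gieStain   = "gneg",  # assumed stain value; pick whatever suits your data
  stringsAsFactors = FALSE
)

if (requireNamespace("Gviz", quietly = TRUE)) {
  # 'genome' is a character label here (assumption: any string is accepted
  # when the band table is supplied directly, so no UCSC lookup is made).
  ideoTrack <- Gviz::IdeogramTrack(genome = "custom", chromosome = "chr10",
                                   bands = bands, fontsize = 14)
}
```

With the genome already loaded as a DNAStringSet, the lengths could instead be taken from width(ncrna), paired with the chromosome names after trimming any extra header text.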

 
