Is it possible to use Gviz without a valid UCSC genome?
@vinicius-henrique-da-silva-6713
Brazil

I would like to use the Gviz package, but my reference genome is in FASTA format rather than a valid UCSC genome (it has not been published yet).

Is it possible to use this FASTA file instead?

Updated for @Florian:
Let's say that I have a genome in FASTA format on disk:

library(Biostrings)

  A DNAStringSet instance of length 6
       width seq                                            names
[1] 20202851 CCCAGTTTTCCCCACTCTGTGA...AAAGATCTTACAACCGATTTT chr10 Major...
[2] 20315886 AGCCGACGAGACTCACAGAACC...TCACAAACCCCCTCGGGAGGG chr11 Major...
[3] 20466350 GATTAGACCTCCGAAAGGGGTA...ATTATTAATTATTAAATATTA chr12 Major...
[4] 16480340 GTCTCCACTTGCCCCACAACGG...AGATGACGATGATGAAGATGA chr13 Major...
[5] 16193477 CTCTGTGACATCACAGCCATGG...GGGTTACACACGTTGTTTTTT chr14 Major...

Using the object created to replace a UCSC genome:

ideoTrack <- IdeogramTrack(genome=ncrna, chromosome="chr10", fontsize=14)

Error in .Call2("new_XStringSet_from_CHARACTER", ans_class, ans_elementType,  :
key 51 (char '3') not in lookup table
In if (!token %in% base::ls(env)) { :
the condition has length > 1 and only the first element will be used

I am probably misunderstanding a very basic feature here, but I would be grateful for some help!

@florianhahnenovartiscom-3784
Switzerland

Gviz per se does not care about the exact nature of your reference genome. A couple of features are only available out of the box for UCSC genomes, such as the bands in IdeogramTracks, although there are ways to make those work for custom genomes, too. If you want to include the reference sequence in a SequenceTrack, you can do that by first reading in your FASTA file as a DNAStringSet using the readDNAStringSet() function in Biostrings.
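As a rough illustration of the SequenceTrack route, here is a minimal sketch. The toy FASTA file, the object names (genome_seq, seq_track) and the "customGenome" label are made up for the example; in practice the path to your own FASTA file goes into readDNAStringSet(). It also assumes that the FASTA headers carry a description after the chromosome name (as in "chr10 Major..."), which is trimmed so the sequence names are plain chromosome names.

```r
library(Biostrings)
library(Gviz)

# Toy FASTA file standing in for the real genome (hypothetical content)
fasta <- tempfile(fileext = ".fa")
writeLines(c(">chr10 toy description", "CCCAGTTTTCCCCACTCTGTGA",
             ">chr11 toy description", "AGCCGACGAGACTCACAGAACC"), fasta)

# Read the FASTA file as a DNAStringSet
genome_seq <- readDNAStringSet(fasta)

# Keep only the first word of each header so names are plain chromosome names
names(genome_seq) <- sub(" .*$", "", names(genome_seq))

# "customGenome" is an arbitrary label here, not a UCSC identifier
seq_track <- SequenceTrack(genome_seq, genome = "customGenome",
                           chromosome = "chr10")

# Plot the first 22 bases of chr10 together with a genome axis
plotTracks(list(GenomeAxisTrack(), seq_track),
           chromosome = "chr10", from = 1, to = 22)
```

Note that the DNAStringSet goes into the sequence argument of SequenceTrack; the genome argument stays a plain character string.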

If this is not what you are after, you will have to give a more detailed explanation of what you are trying to do here, ideally with some reproducible code.

Florian

@florianhahnenovartiscom-3784
Switzerland

The DNAStringSet remark was about setting up SequenceTrack objects, not IdeogramTracks.

The user-provided data for the IdeogramTrack needs to be in the form of a data.frame containing the cytoband information. See the bands argument in the IdeogramTrack documentation:

bands: A ‘data.frame’ with the cytoband information for all
available chromosomes on the genome similar to the data that
would be fetched from UCSC. The table needs to contain the
mandatory columns ‘chrom’, ‘chromStart’, ‘chromEnd’, ‘name’
and ‘gieStain’ with the chromosome name, cytoband start and
end coordinates, cytoband name and coloring information,
respectively. This can be used when no connection to the
internet is available or when the cytoband information has
been cached locally to avoid the somewhat slow connection to
UCSC.
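
A minimal sketch of such a table for a genome without known cytobands could look like the following. It assumes that a single unstained ("gneg") band spanning each chromosome is an acceptable placeholder; that is a choice made for the example, not something the documentation prescribes. The chromosome lengths are copied from the DNAStringSet output in the question (in practice they could be taken from width() of the DNAStringSet), and the names chrom_lengths, ideo_track and "customGenome" are made up.

```r
# Chromosome lengths taken from the DNAStringSet printout in the question
chrom_lengths <- c(chr10 = 20202851, chr11 = 20315886, chr12 = 20466350)

# One placeholder band per chromosome with the five mandatory columns
bands <- data.frame(
  chrom      = names(chrom_lengths),
  chromStart = 0,                      # 0-based start, as in UCSC tables
  chromEnd   = unname(chrom_lengths),
  name       = names(chrom_lengths),   # band name; the chromosome name is reused
  gieStain   = "gneg",                 # assumption: no staining information
  stringsAsFactors = FALSE
)

# Build the track only if Gviz is installed; "customGenome" is an arbitrary label
if (requireNamespace("Gviz", quietly = TRUE)) {
  ideo_track <- Gviz::IdeogramTrack(chromosome = "chr10",
                                    genome = "customGenome",
                                    bands = bands)
}
```

The sketch only constructs the track; how a single-band ideogram renders in plotTracks() is worth checking on your own data.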