Gviz - Cannot plot exons from gencode GTF files
@romainfenouil-9584
Last seen 8.3 years ago

Hello,

I am trying to use Gviz for plotting gene annotations from GTF/GFF files.

It works nicely with files downloaded from the UCSC Table Browser (refFlat, RefSeq, GENCODE). However, I could not find a way to download a file that provides gene and transcript information at the same time: on every line, 'gene_id' and 'transcript_id' hold the same value.

When I download annotations directly from the GENCODE website, the files seem to carry more information (genes/transcripts/symbols...): http://www.gencodegenes.org/releases/current.html

However, while these files are correctly loaded and parsed by the 'GeneRegionTrack' constructor, the plotting step does not resolve the internal gene structure (just one big box, no exons and no thin boxes for non-coding regions...). I tried playing with the thinBoxFeature parameter, in vain.

Has anybody experienced the same problem?

Using the GENCODE files would be very helpful because they allow using Ensembl IDs or gene symbols directly for selecting genes/transcripts. My question probably reflects my misunderstanding of the UCSC/GENCODE or GFF/GTF file formats. Any help would be appreciated :)

As a side question, is the Gviz package still maintained? Or should I ask my questions about it on a more suitable forum/list?

 

Thank you for your help.

Romain.

Robert Ivanek ▴ 730
@robert-ivanek-5892
Last seen 5 months ago
Switzerland

I found this hack to get the correct plot. I am not sure why the direct import does not work as well, but I guess the GFF3 format from GENCODE is not exactly the same as the one from UCSC.

library(rtracklayer)
library(GenomicFeatures)
library(Gviz)
# Read the GENCODE GFF3 with rtracklayer's import function and build a
# lookup table from gene ID to gene symbol
gff3 <- import.gff3("gencode.v24.annotation.gff3.gz")
gene2symbol <- mcols(gff3)[,c("gene_id","gene_name")]
gene2symbol <- unique(gene2symbol)
rownames(gene2symbol) <- gene2symbol$gene_id

# The GENCODE GFF3 can be parsed into a TxDb object; after creating the
# GeneRegionTrack, set the gene symbols using the table from the previous step
txdb <- makeTxDbFromGFF("gencode.v24.annotation.gff3.gz")
geneTrack <- GeneRegionTrack(txdb, chromosome="chr7", from=5527151, to=5563784)
ranges(geneTrack)$symbol <- gene2symbol[ranges(geneTrack)$gene, "gene_name"]

# plotting with gene symbols
plotTracks(geneTrack, chromosome="chr7", from=5527151, to=5563784, showId=TRUE)
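
Since the original question was about selecting genes by symbol, here is a minimal sketch of how the plotting window could be looked up from the imported GFF3 instead of being typed in by hand. The helper name `region_for_symbol` is hypothetical (it is not part of Gviz or rtracklayer), it assumes the imported object has `type` and `gene_name` metadata columns as in the GENCODE GFF3, and the toy GRanges below uses made-up names and coordinates only so that the snippet runs without the annotation file.

```r
library(GenomicRanges)

# Hypothetical helper (not part of Gviz or rtracklayer): return the chromosome,
# start and end of the first gene-level record whose gene_name matches `symbol`.
region_for_symbol <- function(gff, symbol) {
  # which() drops NA comparisons, which would otherwise break the subsetting
  idx <- which(gff$type == "gene" & gff$gene_name == symbol)
  if (length(idx) == 0L) stop("no gene record found for symbol: ", symbol)
  hit <- gff[idx[1]]
  list(chromosome = as.character(seqnames(hit)),
       from = start(hit),
       to = end(hit))
}

# Toy stand-in for the object returned by import.gff3();
# gene names and coordinates are made up for illustration.
toy <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 100, 500), end = c(900, 400, 800)),
  strand = c("+", "+", "-"),
  type = c("gene", "exon", "gene"),
  gene_name = c("GENE_A", "GENE_A", "GENE_B")
)
region <- region_for_symbol(toy, "GENE_B")

# With the objects from the code above (needs Gviz and the GENCODE file;
# "ACTB" is only an example symbol):
# region <- region_for_symbol(gff3, "ACTB")
# geneTrack <- GeneRegionTrack(txdb, chromosome = region$chromosome)
# ranges(geneTrack)$symbol <- gene2symbol[ranges(geneTrack)$gene, "gene_name"]
# plotTracks(geneTrack, chromosome = region$chromosome,
#            from = region$from, to = region$to, showId = TRUE)
```

The helper only reads the gene-level rows, so it works independently of how the track itself is built; if a symbol maps to several gene records, only the first one is used.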

 

@romainfenouil-9584
Last seen 8.3 years ago

Hello,

Yes, there are indeed some minor differences between the two formats. Thank you very much for the workaround, that is a great idea!

Romain.
