If I got it correctly you want to plot the transcript biotype instead of the transcript name? If yes, you could use the ensembldb package:
library(ensembldb)
library(EnsDb.Hsapiens.v75)
edb <- EnsDb.Hsapiens.v75
## Get the GRanges in the correct format for Gviz for a gene, e.g. BCL2L11
gr <- getGeneRegionTrackForGviz(edb, filter = GenenameFilter("BCL2L11"))
> gr
GRanges object with 89 ranges and 6 metadata columns:
       seqnames                 ranges strand |        feature            gene
          <Rle>              <IRanges>  <Rle> |    <character>     <character>
   [1]        2 [111878491, 111878765]      + |           utr5 ENSG00000153094
   [2]        2 [111881310, 111881322]      + |           utr5 ENSG00000153094
   [3]        2 [111878491, 111878765]      + |           utr5 ENSG00000153094
   ...      ...                    ...    ... .            ...             ...
  [87]        2 [111881323, 111881446]      + | protein_coding ENSG00000153094
  [88]        2 [111907621, 111907724]      + | protein_coding ENSG00000153094
  [89]        2 [111918996, 111919016]      + | protein_coding ENSG00000153094
                  exon exon_rank      transcript      symbol
           <character> <integer>     <character> <character>
   [1] ENSE00001514631         1 ENST00000308659     BCL2L11
   [2] ENSE00001418123         2 ENST00000308659     BCL2L11
   [3] ENSE00001514631         1 ENST00000337565     BCL2L11
   ...             ...       ...             ...         ...
  [87] ENSE00001357671         1 ENST00000452231     BCL2L11
  [88] ENSE00003483971         2 ENST00000452231     BCL2L11
  [89] ENSE00003596136         3 ENST00000452231     BCL2L11
  -------
  seqinfo: 1 sequence from GRCh37 genome
## Now we want to get the transcript biotype for these transcripts; we
## want to get the results as a data.frame instead of a GRanges here
biotypes <- transcripts(edb, filter = TxidFilter(gr$transcript), return.type = "data.frame")
rownames(biotypes) <- biotypes$tx_id
## Replace symbol with biotype
gr$symbol <- biotypes[gr$transcript, "tx_biotype"]
library(Gviz)
plotTracks(GeneRegionTrack(gr, name = "BCL2L11"), transcriptAnnotation = "symbol")
 
Alternatively, you could plot separate tracks for protein coding and nonsense mediated decay:
prot_cod <- getGeneRegionTrackForGviz(edb, filter = list(GenenameFilter("BCL2L11"), TxbiotypeFilter("protein_coding")))
nonsense <-getGeneRegionTrackForGviz(edb, filter = list(GenenameFilter("BCL2L11"), TxbiotypeFilter("nonsense_mediated_decay")))
plotTracks(list(GeneRegionTrack(prot_cod, name = "protein_coding"), GeneRegionTrack(nonsense, name = "nonsense_mediated_decay")))
 
Hope that helps,
cheers, jo
                    
                
                 
Thank you for helping me. Indeed, this method works. But can you explain to me why EnsDb.Mmusculus.v79 for the gene "Rabggtb", I only got 2 isoforms (protein coding). When actually, at the website ensembl there is more than 10 isoforms(non sense mediated decay) and others. It seems to me that EnsDb.Mmusculus.v79 is outdated in relation to the release 85, it is sad that I could not find any EnsDb.Mmusculus.v85 available.
Yes, I've only packaged some releases from Ensembl for some species. But it's easy to build newer EnsDb packages on your own:
> library(ensembldb) > library(AnnotationHub) > ah <- AnnotationHub() > ## Check what's available for Mus musculus and Ensembl release 85: > query(ah, c("Mus musculus", "release-85", "gtf")) AnnotationHub with 4 records # snapshotDate(): 2016-08-15 # $dataprovider: Ensembl # $species: Mus musculus # $rdataclass: GRanges # additional mcols(): taxonomyid, genome, description, tags, sourceurl, # sourcetype # retrieve records with, e.g., 'object[["AH51037"]]' title AH51037 | Mus_musculus.GRCm38.85.abinitio.gtf AH51038 | Mus_musculus.GRCm38.85.chr.gtf AH51039 | Mus_musculus.GRCm38.85.chr_patch_hapl_scaff.gtf AH51040 | Mus_musculus.GRCm38.85.gtf > ## The last one is the GTF file we're looking for: provides the gene annotations for Ensembl 85 > my_gtf <- ah["AH51040"] > ## Now build the EnsDb database from this GTF file: > db_file <- ensDbFromAH(my_gtf) > ## You can then load and use this database directly > edb <- EnsDb(db_file) > transcripts(edb, filter = GenenameFilter("Rabggtb")) GRanges object with 12 ranges and 5 metadata columns: seqnames ranges strand | <Rle> <IRanges> <Rle> | ENSMUST00000197438 3 [153907287, 153912311] - | ENSMUST00000167111 3 [153907289, 153912145] - | ENSMUST00000089950 3 [153907289, 153913009] - | ... ... ... ... . ENSMUST00000197829 3 [153908075, 153912033] - | ENSMUST00000198094 3 [153909414, 153912955] - | ENSMUST00000197147 3 [153911986, 153912931] - | tx_id tx_biotype <character> <character> ENSMUST00000197438 ENSMUST00000197438 nonsense_mediated_decay ENSMUST00000167111 ENSMUST00000167111 protein_coding ENSMUST00000089950 ENSMUST00000089950 protein_coding ... ... ... ENSMUST00000197829 ENSMUST00000197829 nonsense_mediated_decay ENSMUST00000198094 ENSMUST00000198094 nonsense_mediated_decay ENSMUST00000197147 ENSMUST00000197147 processed_transcript tx_cds_seq_start tx_cds_seq_end gene_id <numeric> <numeric> <character> ENSMUST00000197438 153910625 153912059 ENSMUSG00000038975 ENSMUST00000167111 153907601 153912059 ENSMUSG00000038975 ENSMUST00000089950 153907601 153912936 ENSMUSG00000038975 ... ... ... ... ENSMUST00000197829 153910072 153912033 ENSMUSG00000038975 ENSMUST00000198094 153910139 153912955 ENSMUSG00000038975 ENSMUST00000197147 <NA> <NA> ENSMUSG00000038975 ------- seqinfo: 1 sequence from GRCm38 genomeSo, that way you get all of your transcripts. I suggest also that you build a
EnsDbpackage from the generated SQLite file using themakeEnsembldbPackage. That way you can always load your annotation as an R package. See the vignette from theensembldbpackage for more information.cheers, jo
Thank you very much!
More one thing, I want to plot only sections of a gene. I got this problem earlier with ensembl genes, when I also was trying to plot the names here but gviz only accept if the whole feature is showing up. Can this be done?
That I don't know, but I suggest you post this as a new question.
Sure, you can control that via the just.group parameter:
plotTracks(GeneRegionTrack(gr, name = "BCL2L11"), transcriptAnnotation = "symbol", from=111881310, just.group = "above")
thank you florian!