Within the gene annotations provided in TxDb.Mmusculus.UCSC.mm9.knownGene there is a snoRNA (gene ID 100217439) whose reported range is almost 36 Mb long, whereas the snoRNA itself is only 56 bp long. I get the same result when I build the annotation myself with makeTranscriptDbFromUCSC.
Any idea why this is happening, and is there a fix for it?
> library(TxDb.Mmusculus.UCSC.mm9.knownGene)
> ucsc.genes = genes(TxDb.Mmusculus.UCSC.mm9.knownGene)
> ucsc.genes[which(width(ucsc.genes) == max(width(ucsc.genes)))]
GRanges object with 1 range and 1 metadata column:
            seqnames               ranges strand |     gene_id
               <Rle>            <IRanges>  <Rle> | <character>
  100217439    chr19 [25016728, 60850288]      - |   100217439
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] XVector_0.6.0
[2] TxDb.Mmusculus.UCSC.mm9.knownGene_3.0.0
[3] GenomicFeatures_1.18.3
[4] AnnotationDbi_1.28.1
[5] Biobase_2.26.0
[6] GenomicRanges_1.18.4
[7] GenomeInfoDb_1.2.4
[8] IRanges_2.0.1
[9] S4Vectors_0.4.0
[10] BiocGenerics_0.12.1
loaded via a namespace (and not attached):
[1] base64enc_0.1-2 BatchJobs_1.5 BBmisc_1.8
[4] BiocParallel_1.0.0 biomaRt_2.22.0 Biostrings_2.34.1
[7] bitops_1.0-6 brew_1.0-6 checkmate_1.5.1
[10] codetools_0.2-9 DBI_0.3.1 digest_0.6.8
[13] fail_1.2 foreach_1.4.2 GenomicAlignments_1.2.1
[16] iterators_1.0.7 RCurl_1.95-4.5 Rsamtools_1.18.2
[19] RSQLite_1.0.0 rtracklayer_1.26.2 sendmailR_1.2-1
[22] stringr_0.6.2 tools_3.1.1 XML_3.98-1.1
[25] zlibbioc_1.12.0
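In case it helps narrow things down, here is a minimal sketch of how the individual transcripts behind this gene ID could be inspected. It rests on an assumption I have not confirmed: that genes() reports a single range spanning every transcript assigned to a gene ID, so several short copies at distant loci on chr19 would show up as one very long range. The names txdb, tx.by.gene and sno.tx are chosen only for illustration.

```r
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene

# Group transcripts by gene ID and pull out the suspicious gene
tx.by.gene <- transcriptsBy(txdb, by = "gene")
sno.tx <- tx.by.gene[["100217439"]]

# Individual transcript ranges and their widths
sno.tx
width(sno.tx)

# Span covering all of this gene's transcripts, for comparison
# with the range reported by genes() (assumption: they may match)
range(sno.tx)
```

If width(sno.tx) shows several short transcripts while range(sno.tx) matches the 36 Mb range above, the long "gene" would simply be the span across those copies rather than a single annotated feature.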