Hi,
I have a list of loci that I would like to annotate with locateVariants function. However, the function does not annotate some variants I am interested in. Below I put an example for the locus chr5:1599742 (genome hg19). I know from UCSC that there is a gene "LOC728613" and a few transcripts. But the locateVariants function does not find it.
I run the code from the fresh R session:
library(VariantAnnotation) library(TxDb.Hsapiens.UCSC.hg19.knownGene) txdb_hg19 <- TxDb.Hsapiens.UCSC.hg19.knownGene chr1 = "chr5" loc1 = 1599742 gr1 = GRanges(seqnames = chr1, ranges = IRanges(start=loc1, loc1)) genome(gr1) = "hg19" gr1_annot <- locateVariants(gr1, txdb_hg19, AllVariants())
I get the empty output:
> gr1_annot GRanges object with 0 ranges and 9 metadata columns: seqnames ranges strand | LOCATION LOCSTART LOCEND QUERYID TXID CDSID GENEID <Rle> <IRanges> <Rle> | <factor> <integer> <integer> <integer> <integer> <IntegerList> <character> PRECEDEID FOLLOWID <CharacterList> <CharacterList> ------- seqinfo: no sequences
But when I run the following code, it shows that there is a gene in that locus:
genes <- genes(txdb_hg19) olaps <- findOverlaps(gr1, genes)
The result:
> olaps Hits object with 1 hit and 0 metadata columns: queryHits subjectHits <integer> <integer> [1] 1 18748 ------- queryLength: 1 / subjectLength: 23056
Do you have any advice, what am I doing wrong? What should i do to obtain a proper annotation of that locus by locateVariants?
Best regards,
Aleksandra
> sessionInfo() R version 3.4.1 (2017-06-30) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 Matrix products: default locale: [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.28.4 [3] AnnotationDbi_1.38.2 VariantAnnotation_1.22.3 [5] Rsamtools_1.28.0 Biostrings_2.44.2 [7] XVector_0.16.0 SummarizedExperiment_1.6.3 [9] DelayedArray_0.2.7 matrixStats_0.52.2 [11] Biobase_2.36.2 GenomicRanges_1.28.4 [13] GenomeInfoDb_1.12.2 IRanges_2.10.2 [15] S4Vectors_0.14.3 BiocGenerics_0.22.0 loaded via a namespace (and not attached): [1] Rcpp_0.12.12 compiler_3.4.1 bitops_1.0-6 tools_3.4.1 [5] zlibbioc_1.22.0 biomaRt_2.32.1 digest_0.6.12 bit_1.1-12 [9] RSQLite_2.0 memoise_1.1.0 tibble_1.3.3 lattice_0.20-35 [13] BSgenome_1.44.0 pkgconfig_2.0.1 rlang_0.1.2 Matrix_1.2-10 [17] DBI_0.7 GenomeInfoDbData_0.99.0 rtracklayer_1.36.4 bit64_0.9-7 [21] grid_3.4.1 XML_3.98-1.9 BiocParallel_1.10.1 blob_1.1.0 [25] GenomicAlignments_1.12.1 RCurl_1.95-4.8