Dear all,
I am finding some unexpected results (to me anyway) with the
VariantAnnotation package. Basically, there are situations where the
GENEID is missing even though LOCATION is one of coding, promoter,
intron, threeUTR or fiveUTR. Here is an example with five SNPs (among
many more). I have marked the unexpected rows with "##".
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

# Five example SNPs: rsid, chromosome, position
tmp <- rbind.data.frame(c("rs10917388",  "chr1",  23803138),
                        c("rs1063412",   "chr1",  172410967),
                        c("rs78291220",  "chr2",  60890373),
                        c("rs116917239", "chr17", 44061025),
                        c("rs11593",     "chrX",  153627145))
colnames(tmp) <- c("rsid", "chr", "pos")
tmp$pos <- as.numeric(as.character(tmp$pos))

# One single-base range per SNP, strand unspecified
target <- with(tmp, GRanges(seqnames = Rle(chr),
                            ranges   = IRanges(pos, end = pos, names = rsid),
                            strand   = Rle(strand("*"))))

# Annotate every SNP against all variant types
loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene,
                      AllVariants())
names(loc) <- NULL

# Keep the columns of interest and drop duplicate rows
out <- as.data.frame(loc)
out$rsid <- names(target)[out$QUERYID]
out <- out[, c("rsid", "seqnames", "start", "LOCATION", "GENEID",
               "PRECEDEID", "FOLLOWID")]
out <- unique(out)
rownames(out) <- NULL
out
          rsid seqnames     start   LOCATION GENEID PRECEDEID FOLLOWID
1   rs10917388     chr1  23803138     intron  55616      <NA>     <NA>
2   rs10917388     chr1  23803138   promoter   <NA>      <NA>     <NA>  ##
3    rs1063412     chr1 172410967     intron  92346      <NA>     <NA>
4    rs1063412     chr1 172410967     intron   5279      <NA>     <NA>
5    rs1063412     chr1 172410967     coding   5279      <NA>     <NA>
6    rs1063412     chr1 172410967     coding   <NA>      <NA>     <NA>  ##
7   rs78291220     chr2  60890373   promoter   <NA>      <NA>     <NA>  ##
8   rs78291220     chr2  60890373 intergenic   <NA>     64895   400957
9  rs116917239    chr17  44061025     coding   4137      <NA>     <NA>
10 rs116917239    chr17  44061025     intron   4137      <NA>     <NA>
11 rs116917239    chr17  44061025     coding   <NA>      <NA>     <NA>  ##
12     rs11593     chrX 153627145     intron   6134      <NA>     <NA>
13     rs11593     chrX 153627145   promoter   6134      <NA>     <NA>
14     rs11593     chrX 153627145   promoter  26778      <NA>     <NA>
15     rs11593     chrX 153627145   promoter   <NA>      <NA>     <NA>  ##
16     rs11593     chrX 153627145    fiveUTR   <NA>      <NA>     <NA>  ##
17     rs11593     chrX 153627145   threeUTR   <NA>      <NA>     <NA>  ##
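
To make the "##" marking reproducible, the flagged rows are those with a missing GENEID and a LOCATION other than intergenic (the intergenic row reports its neighbouring genes in PRECEDEID and FOLLOWID instead). A minimal sketch in base R, using a small data frame with four rows copied from the output above rather than the full result; the name `unexpected` is just illustrative:

```r
# Four rows copied from the output above (not the full table)
out <- data.frame(
  rsid     = c("rs10917388", "rs10917388", "rs78291220", "rs78291220"),
  LOCATION = c("intron", "promoter", "promoter", "intergenic"),
  GENEID   = c("55616", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Rows where a gene-based LOCATION has no GENEID.
# Intergenic rows are excluded because a missing GENEID is expected there.
unexpected <- out[is.na(out$GENEID) & out$LOCATION != "intergenic", ]
unexpected
```

The same filter applied to the full `out` from locateVariants() should select exactly the rows I marked by hand.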
Can anyone help explain what is happening please? Is this to be
expected? Thank you.
Regards, Adai