SNP annotation in R
1
0
Entering edit mode
Ina Hoeschele ▴ 620
@ina-hoeschele-2992
Last seen 2.7 years ago
United States
Hi, I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? Thanks, Ina input <- SNP.map.retain[SNPs,1:3] rownames(input) <- NULL colnames(input) <- c("rsid", "chr", "pos") input # rsid chr pos #1 rs661761 1 30618672 #2 rs3754508 1 17255474 #3 rs6426403 1 4169394 #4 rs228727 1 7770423 #5 rs1635592 1 17535994 target <- with(input, GRanges( seqnames = Rle(chr), ranges = IRanges(pos, end=pos, names=rsid), strand = Rle(strand("*")) ) ) target #GRanges with 5 ranges and 0 metadata columns: # seqnames ranges strand # <rle> <iranges> <rle> # rs661761 1 [30618672, 30618672] * # rs3754508 1 [17255474, 17255474] * # rs6426403 1 [ 4169394, 4169394] * # rs228727 1 [ 7770423, 7770423] * # rs1635592 1 [17535994, 17535994] * # --- # seqlengths: # 1 # NA loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) Error in .mk_isActiveSeqReplacementValue(x, value) : the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq'
SNP SNP • 3.2k views
ADD COMMENT
0
Entering edit mode
Ina Hoeschele ▴ 620
@ina-hoeschele-2992
Last seen 2.7 years ago
United States
Hi, I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? Thanks, Ina input <- SNP.map.retain[SNPs,1:3] rownames(input) <- NULL colnames(input) <- c("rsid", "chr", "pos") input # rsid chr pos #1 rs661761 1 30618672 #2 rs3754508 1 17255474 #3 rs6426403 1 4169394 #4 rs228727 1 7770423 #5 rs1635592 1 17535994 target <- with(input, GRanges( seqnames = Rle(chr), ranges = IRanges(pos, end=pos, names=rsid), strand = Rle(strand("*")) ) ) target #GRanges with 5 ranges and 0 metadata columns: # seqnames ranges strand # <rle> <iranges> <rle> # rs661761 1 [30618672, 30618672] * # rs3754508 1 [17255474, 17255474] * # rs6426403 1 [ 4169394, 4169394] * # rs228727 1 [ 7770423, 7770423] * # rs1635592 1 [17535994, 17535994] * # --- # seqlengths: # 1 # NA loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) Error in .mk_isActiveSeqReplacementValue(x, value) : the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq'
ADD COMMENT
0
Entering edit mode
Hi Ina, It looks like the chromosome names in 'target' don't match those in the TranscriptDb. There is an example on the locateVariants man page of how to inspect the seqlevels (chromosome names) with seqlevels() and change them with renameSeqlevels(). Once you've made the change, confirm that the names match with the intersect() function. This is also demonstrated on the man page. Valerie On 06/18/2013 01:47 PM, Ina Hoeschele wrote: > Hi, > I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: > http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ > However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? > Thanks, Ina > > input <- SNP.map.retain[SNPs,1:3] > rownames(input) <- NULL > colnames(input) <- c("rsid", "chr", "pos") > input > # rsid chr pos > #1 rs661761 1 30618672 > #2 rs3754508 1 17255474 > #3 rs6426403 1 4169394 > #4 rs228727 1 7770423 > #5 rs1635592 1 17535994 > target <- with(input, > GRanges( seqnames = Rle(chr), > ranges = IRanges(pos, end=pos, names=rsid), > strand = Rle(strand("*")) ) ) > target > #GRanges with 5 ranges and 0 metadata columns: > # seqnames ranges strand > # <rle> <iranges> <rle> > # rs661761 1 [30618672, 30618672] * > # rs3754508 1 [17255474, 17255474] * > # rs6426403 1 [ 4169394, 4169394] * > # rs228727 1 [ 7770423, 7770423] * > # rs1635592 1 [17535994, 17535994] * > # --- > # seqlengths: > # 1 > # NA > > loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) > Error in .mk_isActiveSeqReplacementValue(x, value) : > the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq' > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
Hi Ina, It looks like the chromosome names in 'target' don't match those in the TranscriptDb. There is an example on the locateVariants man page of how to inspect the seqlevels (chromosome names) with seqlevels() and change them with renameSeqlevels(). Once you've made the change, confirm that the names match with the intersect() function. This is also demonstrated on the man page. Valerie On 06/18/2013 01:47 PM, Ina Hoeschele wrote: > Hi, > I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: > http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ > However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? > Thanks, Ina > > input <- SNP.map.retain[SNPs,1:3] > rownames(input) <- NULL > colnames(input) <- c("rsid", "chr", "pos") > input > # rsid chr pos > #1 rs661761 1 30618672 > #2 rs3754508 1 17255474 > #3 rs6426403 1 4169394 > #4 rs228727 1 7770423 > #5 rs1635592 1 17535994 > target <- with(input, > GRanges( seqnames = Rle(chr), > ranges = IRanges(pos, end=pos, names=rsid), > strand = Rle(strand("*")) ) ) > target > #GRanges with 5 ranges and 0 metadata columns: > # seqnames ranges strand > # <rle> <iranges> <rle> > # rs661761 1 [30618672, 30618672] * > # rs3754508 1 [17255474, 17255474] * > # rs6426403 1 [ 4169394, 4169394] * > # rs228727 1 [ 7770423, 7770423] * > # rs1635592 1 [17535994, 17535994] * > # --- > # seqlengths: > # 1 > # NA > > loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) > Error in .mk_isActiveSeqReplacementValue(x, value) : > the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq' > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY

Login before adding your answer.

Traffic: 767 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6