SNP annotation in R

0

Entering edit mode

Ina Hoeschele ▴ 620

@ina-hoeschele-2992

Last seen 3.3 years ago

United States

Hi, I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? Thanks, Ina input <- SNP.map.retain[SNPs,1:3] rownames(input) <- NULL colnames(input) <- c("rsid", "chr", "pos") input # rsid chr pos #1 rs661761 1 30618672 #2 rs3754508 1 17255474 #3 rs6426403 1 4169394 #4 rs228727 1 7770423 #5 rs1635592 1 17535994 target <- with(input, GRanges( seqnames = Rle(chr), ranges = IRanges(pos, end=pos, names=rsid), strand = Rle(strand("*")) ) ) target #GRanges with 5 ranges and 0 metadata columns: # seqnames ranges strand # <rle> <iranges> <rle> # rs661761 1 [30618672, 30618672] * # rs3754508 1 [17255474, 17255474] * # rs6426403 1 [ 4169394, 4169394] * # rs228727 1 [ 7770423, 7770423] * # rs1635592 1 [17535994, 17535994] * # --- # seqlengths: # 1 # NA loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) Error in .mk_isActiveSeqReplacementValue(x, value) : the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq'

SNP SNP • 3.5k views

ADD COMMENT • link 11.5 years ago Ina Hoeschele ▴ 620

0

Entering edit mode

Ina Hoeschele ▴ 620

@ina-hoeschele-2992

Last seen 3.3 years ago

United States

Hi, I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? Thanks, Ina input <- SNP.map.retain[SNPs,1:3] rownames(input) <- NULL colnames(input) <- c("rsid", "chr", "pos") input # rsid chr pos #1 rs661761 1 30618672 #2 rs3754508 1 17255474 #3 rs6426403 1 4169394 #4 rs228727 1 7770423 #5 rs1635592 1 17535994 target <- with(input, GRanges( seqnames = Rle(chr), ranges = IRanges(pos, end=pos, names=rsid), strand = Rle(strand("*")) ) ) target #GRanges with 5 ranges and 0 metadata columns: # seqnames ranges strand # <rle> <iranges> <rle> # rs661761 1 [30618672, 30618672] * # rs3754508 1 [17255474, 17255474] * # rs6426403 1 [ 4169394, 4169394] * # rs228727 1 [ 7770423, 7770423] * # rs1635592 1 [17535994, 17535994] * # --- # seqlengths: # 1 # NA loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) Error in .mk_isActiveSeqReplacementValue(x, value) : the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq'

ADD COMMENT • link 11.5 years ago Ina Hoeschele ▴ 620

0

Entering edit mode

Hi Ina, It looks like the chromosome names in 'target' don't match those in the TranscriptDb. There is an example on the locateVariants man page of how to inspect the seqlevels (chromosome names) with seqlevels() and change them with renameSeqlevels(). Once you've made the change, confirm that the names match with the intersect() function. This is also demonstrated on the man page. Valerie On 06/18/2013 01:47 PM, Ina Hoeschele wrote: > Hi, > I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: > http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ > However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? > Thanks, Ina > > input <- SNP.map.retain[SNPs,1:3] > rownames(input) <- NULL > colnames(input) <- c("rsid", "chr", "pos") > input > # rsid chr pos > #1 rs661761 1 30618672 > #2 rs3754508 1 17255474 > #3 rs6426403 1 4169394 > #4 rs228727 1 7770423 > #5 rs1635592 1 17535994 > target <- with(input, > GRanges( seqnames = Rle(chr), > ranges = IRanges(pos, end=pos, names=rsid), > strand = Rle(strand("*")) ) ) > target > #GRanges with 5 ranges and 0 metadata columns: > # seqnames ranges strand > # <rle> <iranges> <rle> > # rs661761 1 [30618672, 30618672] * > # rs3754508 1 [17255474, 17255474] * > # rs6426403 1 [ 4169394, 4169394] * > # rs228727 1 [ 7770423, 7770423] * > # rs1635592 1 [17535994, 17535994] * > # --- > # seqlengths: > # 1 > # NA > > loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) > Error in .mk_isActiveSeqReplacementValue(x, value) : > the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq' > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >

ADD REPLY • link 11.5 years ago Valerie Obenchain ★ 6.8k

0

Entering edit mode

Hi Ina, It looks like the chromosome names in 'target' don't match those in the TranscriptDb. There is an example on the locateVariants man page of how to inspect the seqlevels (chromosome names) with seqlevels() and change them with renameSeqlevels(). Once you've made the change, confirm that the names match with the intersect() function. This is also demonstrated on the man page. Valerie On 06/18/2013 01:47 PM, Ina Hoeschele wrote: > Hi, > I am interested in annotating lists of SNPs in R, mostly I am interested in finding the gene a SNP is located in and for intergenic SNPs which genes are close by. I found this code that wsa developed with the help of Valerie Obenchain: > http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps- and-indels-in-bioconductor/ > However, when I adapted this code to my problem I got the error message below. Can someone please give me a hint? > Thanks, Ina > > input <- SNP.map.retain[SNPs,1:3] > rownames(input) <- NULL > colnames(input) <- c("rsid", "chr", "pos") > input > # rsid chr pos > #1 rs661761 1 30618672 > #2 rs3754508 1 17255474 > #3 rs6426403 1 4169394 > #4 rs228727 1 7770423 > #5 rs1635592 1 17535994 > target <- with(input, > GRanges( seqnames = Rle(chr), > ranges = IRanges(pos, end=pos, names=rsid), > strand = Rle(strand("*")) ) ) > target > #GRanges with 5 ranges and 0 metadata columns: > # seqnames ranges strand > # <rle> <iranges> <rle> > # rs661761 1 [30618672, 30618672] * > # rs3754508 1 [17255474, 17255474] * > # rs6426403 1 [ 4169394, 4169394] * > # rs228727 1 [ 7770423, 7770423] * > # rs1635592 1 [17535994, 17535994] * > # --- > # seqlengths: > # 1 > # NA > > loc <- locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants()) > Error in .mk_isActiveSeqReplacementValue(x, value) : > the names of the supplied 'isActiveSeq' must match the names of the current 'isActiveSeq' > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >

ADD REPLY • link 11.5 years ago Valerie Obenchain ★ 6.8k

Login before adding your answer.