stranded intronic variants with VariantAnnotation::locateVariants()
Robert Castelo ★ 3.2k
@rcastelo
Barcelona/Universitat Pompeu Fabra
hi,

i have the following feature request for the VariantAnnotation package.

currently, the function predictCoding() annotates the strand of variants within exons according to a given gene annotation. would it be possible that the function locateVariants() in the VariantAnnotation package annotates the strand for intronic variants?

introns are non-coding, and therefore, not annotated with predictCoding(), but are stranded (GT-AG).

here goes a code snippet that illustrates what i'm talking about (adapted from the vignette):

=================
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- readVcf(fl, "hg19")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
seqlevels(vcf) <- "chr22"
rd <- rowData(vcf)
loc <- locateVariants(rd, txdb, IntronVariants())

head(loc, n=3)
GRanges with 3 ranges and 7 metadata columns:
      seqnames               ranges strand | LOCATION   QUERYID      TXID     CDSID      GENEID
         <Rle>            <IRanges>  <Rle> | <factor> <integer> <integer> <integer> <character>
  [1]    chr22 [50300078, 50300078]      * |   intron         1     75253      <NA>       79087
  [2]    chr22 [50300086, 50300086]      * |   intron         2     75253      <NA>       79087
  [3]    chr22 [50300101, 50300101]      * |   intron         3     75253      <NA>       79087
            PRECEDEID        FOLLOWID
      <CharacterList> <CharacterList>
  [1]
  [2]
  [3]
  ---
  seqlengths:
   chr22
      NA
=================

i.e., the strand column is set to * for the intronic variants. it's ok if this new feature would be added to the devel version, as happens normally with new features.

thanks!
robert.
ps: sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
 [2] GenomicFeatures_1.14.0
 [3] AnnotationDbi_1.24.0
 [4] Biobase_2.22.0
 [5] VariantAnnotation_1.8.0
 [6] Rsamtools_1.14.1
 [7] Biostrings_2.30.0
 [8] GenomicRanges_1.14.1
 [9] XVector_0.2.0
[10] IRanges_1.20.0
[11] BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] biomaRt_2.18.0     bitops_1.0-6       BSgenome_1.30.0    DBI_0.2-7
 [5] RCurl_1.95-4.1     RSQLite_0.11.4     rtracklayer_1.22.0 stats4_3.0.2
 [9] tools_3.0.2        XML_3.95-0.2       zlibbioc_1.8.0
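Until locateVariants() reports the strand itself, one possible workaround is to look the strand up from the transcript each variant was assigned to. The sketch below is only an illustration in base R with toy data: strand_by_txid() is a hypothetical helper (not part of VariantAnnotation), the ids and strands are made up, and it assumes the TXID column refers to the same transcript ids that the annotation reports.

```r
# Hypothetical helper (not part of VariantAnnotation): return the strand of
# the transcript each variant was assigned to, by matching transcript ids.
strand_by_txid <- function(txid, tx_ids, tx_strands) {
  idx <- match(as.character(txid), as.character(tx_ids))
  out <- as.character(tx_strands)[idx]
  out[is.na(out)] <- "*"   # no matching transcript: leave unstranded
  out
}

# Toy data (made up) standing in for the TXID column of the result and for
# the transcript ids and strands of the annotation.
variant_txid <- c(102L, 102L, 101L, NA)
tx_ids       <- c(101L, 102L)
tx_strands   <- c("+", "-")

variant_strand <- strand_by_txid(variant_txid, tx_ids, tx_strands)
print(variant_strand)   # "-" "-" "+" "*"
```

With the objects from the snippet above, this would presumably be used as `tx <- transcripts(txdb)` followed by `strand(loc) <- strand_by_txid(mcols(loc)$TXID, mcols(tx)$tx_id, strand(tx))`, assuming TXID corresponds to the tx_id column returned by transcripts(). A variant that falls in introns of transcripts on both strands appears once per transcript in loc, so each row gets the strand of its own transcript.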
@valerie-obenchain-4275
United States
Hi Robert,

Yes, I can add that. I'll let you know when it's done.

Valerie

On 10/17/2013 04:01 AM, Robert Castelo wrote:
> hi,
>
> i have the following feature request for the VariantAnnotation package.
> [...]
Great! thanks a lot Valerie!!

robert.

On 10/18/13 10:19 PM, Valerie Obenchain wrote:
> Hi Robert,
>
> Yes, I can add that. I'll let you know when it's done.
>
> Valerie