Question: stranded intronic variants with VariantAnnotation::locateVariants()
5.3 years ago by Robert Castelo (Spain/Barcelona/Universitat Pompeu Fabra):
hi,

i have the following feature request for the VariantAnnotation package.

currently, the function predictCoding() annotates the strand of variants within exons according to a given gene annotation. would it be possible that the function locateVariants() in the VariantAnnotation package annotates the strand for intronic variants?

introns are non-coding, and therefore, not annotated with predictCoding(), but are stranded (GT-AG).

here goes a code snippet that illustrates what i'm talking about (adapted from the vignette):

=================
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- readVcf(fl, "hg19")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
seqlevels(vcf) <- "chr22"
rd <- rowData(vcf)
loc <- locateVariants(rd, txdb, IntronVariants())

head(loc, n=3)
GRanges with 3 ranges and 7 metadata columns:
      seqnames               ranges strand | LOCATION   QUERYID      TXID     CDSID      GENEID
         <Rle>            <IRanges>  <Rle> | <factor> <integer> <integer> <integer> <character>
  [1]    chr22 [50300078, 50300078]      * |   intron         1     75253      <NA>       79087
  [2]    chr22 [50300086, 50300086]      * |   intron         2     75253      <NA>       79087
  [3]    chr22 [50300101, 50300101]      * |   intron         3     75253      <NA>       79087
            PRECEDEID        FOLLOWID
      <CharacterList> <CharacterList>
  [1]
  [2]
  [3]
  ---
  seqlengths:
   chr22
      NA
=================

i.e., the strand column is set to * for the intronic variants. it's ok if this new feature would be added to the devel version, as happens normally with new features.

thanks!
robert.
ps: sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
 [2] GenomicFeatures_1.14.0
 [3] AnnotationDbi_1.24.0
 [4] Biobase_2.22.0
 [5] VariantAnnotation_1.8.0
 [6] Rsamtools_1.14.1
 [7] Biostrings_2.30.0
 [8] GenomicRanges_1.14.1
 [9] XVector_0.2.0
[10] IRanges_1.20.0
[11] BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] biomaRt_2.18.0     bitops_1.0-6       BSgenome_1.30.0    DBI_0.2-7
 [5] RCurl_1.95-4.1     RSQLite_0.11.4     rtracklayer_1.22.0 stats4_3.0.2
 [9] tools_3.0.2        XML_3.95-0.2       zlibbioc_1.8.0
Answer: stranded intronic variants with VariantAnnotation::locateVariants()
5.3 years ago by Valerie Obenchain (United States):
Hi Robert,

Yes, I can add that. I'll let you know when it's done.

Valerie

On 10/17/2013 04:01 AM, Robert Castelo wrote:
> hi,
>
> i have the following feature request for the VariantAnnotation package.
> [...]
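Until strand is reported directly, one possible workaround is to copy the strand from the transcript each intronic hit was matched to, using the TXID column of the locateVariants() result. The following is only a minimal sketch under stated assumptions: `add_tx_strand` is a hypothetical helper and not part of VariantAnnotation, it assumes TXID holds the TxDb internal transcript id (the `tx_id` column returned by transcripts()), and the toy `tx` and `loc` objects are made up so the example runs without the annotation package.

```r
library(GenomicRanges)  # GRanges(), strand(), mcols()

# Hypothetical helper (an assumption, not a VariantAnnotation function):
# set the strand of every row of `loc` to the strand of the transcript
# named in its TXID column. `tx` is a GRanges of transcripts carrying a
# `tx_id` metadata column.
add_tx_strand <- function(loc, tx) {
    # Compare ids as character so integer and character TXID both work.
    idx <- match(as.character(mcols(loc)$TXID),
                 as.character(mcols(tx)$tx_id))
    if (any(is.na(idx)))
        stop("some TXID values were not found in 'tx'")
    strand(loc) <- as.character(strand(tx))[idx]
    loc
}

# Toy stand-ins for the transcripts of a TxDb and for a locateVariants()
# result (made-up coordinates and ids, for illustration only).
tx <- GRanges(rep("chr22", 2), IRanges(c(100, 500), c(400, 900)),
              strand=c("+", "-"), tx_id=c(1L, 2L))
loc <- GRanges(rep("chr22", 3), IRanges(c(150, 600, 700), width=1),
               strand=rep("*", 3), TXID=c(1L, 2L, 2L))

stranded <- add_tx_strand(loc, tx)
strand(stranded)
```

With the objects from the original snippet, the call would presumably be add_tx_strand(loc, transcripts(txdb, columns="tx_id")). Because locateVariants() returns one row per variant/transcript pair, a variant overlapping transcripts on both strands simply gets one stranded row for each.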
Great! thanks a lot Valerie!!

robert.

On 10/18/13 10:19 PM, Valerie Obenchain wrote:
> Hi Robert,
>
> Yes, I can add that. I'll let you know when it's done.
>
> Valerie