VariantAnnotation: finer definitions when locating variants in and around genes
@fabrice-tourre-4394
Last seen 10.2 years ago
Dear list,

I am using VariantAnnotation to locate variants in and around genes.

In VariantAnnotation, the region is defined as one of: CodingVariants, IntronVariants, FiveUTRVariants, ThreeUTRVariants, IntergenicVariants, SpliceSiteVariants or PromoterVariants.

Is it possible to know whether a SNP is in an exon/intron within the transcribed region but outside the coding region?

Thanks.
SNP VariantAnnotation • 1.1k views
@valerie-obenchain-4275
Last seen 2.9 years ago
United States
Hi Fabrice,

To identify SNPs (or any ranges) in introns only, use IntronVariants() as the 'region' argument. CodingVariants() covers the coding portions of exons. If you want all regions except coding, I would suggest using AllVariants() and subsetting the result.

This output is from the man page example. The 'loc_coding' name is misleading since AllVariants() was used as 'region'. I have changed it to 'loc_all' in the devel branch.

    > loc_coding <- locateVariants(vcf_adj, txdb, AllVariants())
    > loc_coding
    GRanges with 16 ranges and 7 metadata columns:
          seqnames           ranges strand |   LOCATION   QUERYID
             <Rle>        <IRanges>  <Rle> |   <factor> <integer>
              chr1 [ 13220,  13220]      * |     intron         1
              chr1 [ 13220,  13220]      * | spliceSite         1
              chr1 [ 13220,  13220]      * |     intron         1
              chr1 [ 13220,  13220]      * |     intron         1
              chr1 [ 13220,  13220]      * | spliceSite         1
               ...              ...

This example has variants in splice sites, introns, coding and intergenic regions.

    > tbl <- table(loc_coding$LOCATION)
    > tbl[tbl > 0]

    spliceSite     intron     coding intergenic
             2          7          2          5

The result can be subset on LOCATION for the region of interest. The QUERYID column maps back to the row number in the original 'query' argument to locateVariants().

    > introns <- loc_coding[loc_coding$LOCATION == "intron", ]
    > head(introns, 3)
    GRanges with 3 ranges and 7 metadata columns:
          seqnames         ranges strand | LOCATION   QUERYID      TXID
             <Rle>      <IRanges>  <Rle> | <factor> <integer> <integer>
              chr1 [13220, 13220]      * |   intron         1         1
              chr1 [13220, 13220]      * |   intron         1         2
              chr1 [13220, 13220]      * |   intron         1         3

Valerie
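For anyone who wants to reproduce the steps above end to end, here is a minimal sketch. It assumes the VariantAnnotation and TxDb.Hsapiens.UCSC.hg19.knownGene packages are installed, and it uses the chr22 sample VCF that ships with VariantAnnotation rather than the man page's 'vcf_adj' object, so the counts will not match the output quoted above. The object names (vcf, rd, loc_all, introns) are illustrative only.

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Sample VCF bundled with VariantAnnotation (assumption: stand-in for your own data)
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# The VCF names the chromosome "22"; the TxDb uses "chr22"
seqlevels(vcf) <- "chr22"
rd <- rowRanges(vcf)

# One row per (variant, transcript, region) combination
loc_all <- locateVariants(rd, txdb, AllVariants())

# Count hits per region, dropping empty factor levels
tbl <- table(loc_all$LOCATION)
tbl[tbl > 0]

# Keep only the intron hits
introns <- loc_all[loc_all$LOCATION == "intron"]
head(introns, 3)
```

Note that a single variant can appear several times in the result, once for each transcript and region it falls in, which is why QUERYID repeats.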
Valerie,

Thank you for your reply.

Is there a function in VariantAnnotation to tell whether a SNP is within a transcribed region but outside the coding region? Or whether it is in the first exon/intron?
On 01/31/2013 01:48 PM, Fabrice Tourre wrote:
> Valerie,
>
> Thank you for your reply.
>
> Is there a function in VariantAnnotation to know whether a snp is
> within transcription region but outside coding region? Or is it in
> first exon/intron?

Yes, the function is called locateVariants(). Use AllVariants() as the 'region' argument and subset your result on the UTR and intron regions. Using the example from my previous reply,

    myregions <- c("intron", "threeUTR", "fiveUTR")
    loc_coding[loc_coding$LOCATION %in% myregions]

Valerie
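A minimal sketch of that subsetting, plus one possible way to approach the "first exon" part of the question, which the reply above does not cover. It again assumes the chr22 sample VCF from VariantAnnotation and the TxDb.Hsapiens.UCSC.hg19.knownGene package; all object names are illustrative. The first-exon step relies on exonsBy() from GenomicFeatures and its 'exon_rank' column, and is a suggestion rather than something built into locateVariants().

```r
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")
seqlevels(vcf) <- "chr22"          # match the TxDb chromosome naming
rd <- rowRanges(vcf)

loc_all <- locateVariants(rd, txdb, AllVariants())

# Hits inside a transcript but outside the coding sequence
myregions <- c("intron", "threeUTR", "fiveUTR")
noncoding <- loc_all[loc_all$LOCATION %in% myregions]

# QUERYID is the row index into the query, so it maps hits back to variants
hit_ids <- unique(noncoding$QUERYID)
noncoding_variants <- rd[hit_ids]

# A variant may be intronic in one transcript and coding in another;
# drop those if you want variants with no coding hit at all
coding_ids <- unique(loc_all$QUERYID[loc_all$LOCATION == "coding"])
strictly_noncoding <- rd[setdiff(hit_ids, coding_ids)]

# First exon of each transcript (exon_rank is 1 for the 5'-most exon)
ex <- unlist(exonsBy(txdb, by = "tx"), use.names = FALSE)
first_exons <- ex[ex$exon_rank == 1]

# TRUE for each variant that overlaps the first exon of any transcript
in_first_exon <- overlapsAny(rd, first_exons)
table(in_first_exon)
```

The first exon can contain both UTR and coding sequence, so combine 'in_first_exon' with the LOCATION subset if you need, for example, first-exon variants that fall only in the 5' UTR.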