Question: injectSNPs only for common variants
arubio wrote, 3.1 years ago:

Hi all, I am developing a method to automatically select primers to validate splicing events. I am using SNPlocs.Hsapiens.dbSNP144.GRCh37 and injectSNPs to identify regions with genomic variants (to avoid placing the primers on them). So far, so good.

However, the number of SNPs is very high (around 150 million) and it is almost impossible to find a sufficiently large region with no variants to place the primers. Is it possible to inject into the reference genome only the SNPs with a minor allele frequency (MAF) larger than a threshold (say 5%)? Is the MAF information available somewhere in the Bioconductor annotation data?

Thanks,

Angel

MAF is a population-specific parameter.  You may be able to get some information out of AnnotationHub; I can't verify as I have a terrible connection at the moment.

> library(AnnotationHub)
> ah <- AnnotationHub()
> query(ah, "Common SNPs")
AnnotationHub with 3 records
# snapshotDate(): 2016-10-11
# $dataprovider: UCSC
# $species: Homo sapiens
# $rdataclass: GRanges
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH5105"]]'

           title
  AH5105 | Common SNPs(137)
  AH5108 | Common SNPs(135)
  AH5111 | Common SNPs(132)

It seems that UCSC has a table with this information:

http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/snp141Common.sql

so you may be able to get useful statistics with rtracklayer. Another approach is to query the 1000 Genomes VCFs: snpStats::col.summary will compute MAF on a SnpMatrix, which VariantAnnotation::genotypeToSnpMatrix can build from the VCF genotypes.
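
For reference, the MAF that such a summary reports can be sketched in a few lines of base R. This is only a toy illustration with invented genotypes, assuming calls are coded as the number of alternate alleles (0, 1 or 2) with NA for missing; `maf_from_genotypes` is a hypothetical helper, not part of snpStats or VariantAnnotation.

```r
# Toy genotype matrix: rows are samples, columns are SNPs.
# Entries count copies of the alternate allele (0, 1 or 2); NA is a missing call.
# All values are invented for illustration.
geno <- matrix(
  c(0, 1, 2, 0,
    0, 0, 2, NA,
    1, 0, 2, 0,
    0, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), paste0("snp", 1:4))
)

# Hypothetical helper: minor allele frequency per SNP from allele counts.
maf_from_genotypes <- function(g) {
  n_called <- colSums(!is.na(g))                         # samples with a call
  alt_freq <- colSums(g, na.rm = TRUE) / (2 * n_called)  # alternate allele frequency
  pmin(alt_freq, 1 - alt_freq)                           # frequency of the rarer allele
}

maf <- maf_from_genotypes(geno)

# SNPs at or above a 5% threshold, as in the question.
common <- names(maf)[maf >= 0.05]
print(maf)
print(common)
```

Because MAF is population-specific, the result depends on which samples go into the matrix, so the same SNP can be "common" in one panel and rare in another.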

Comment written 3.1 years ago by Vincent J. Carey, Jr.
Answer: injectSNPs only for common variants
Robert Castelo (Spain/Barcelona/Universitat Pompeu Fabra) wrote, 3.1 years ago:

Hi,

the MafDb.* annotation packages store MAF values from a number of sources:

MafDb.1Kgenomes.phase1.hs37d5
MafDb.1Kgenomes.phase3.hs37d5
MafDb.ESP6500SI.V2.SSA137.GRCh38
MafDb.ESP6500SI.V2.SSA137.hs37d5
MafDb.ExAC.r0.3.1.nonTCGA.snvs.hs37d5
MafDb.ExAC.r0.3.1.snvs.hs37d5

To access the values through those packages, first install the package you need, then load it and use the function mafByOverlaps() or mafById(). Type the following to see an example:

library(MafDb.1Kgenomes.phase3.hs37d5)
example(MafDb.1Kgenomes.phase3.hs37d5)
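
Once MAF values are in hand, one possible way to use them for primer placement is sketched below in base R. It does not call the MafDb API: the reference stretch, the SNP table and the names `maf_threshold` and `min_primer_len` are all invented for illustration, and positions are assumed to be 1-based. Only SNPs at or above the threshold are masked with "N", and the remaining runs are scanned for windows long enough to hold a primer.

```r
# Hypothetical reference stretch (30 nt) and SNP table; positions are 1-based.
ref <- "ACGTACGTACGTACGTACGTACGTACGTAC"
snps <- data.frame(
  pos = c(3, 10, 11, 25),
  maf = c(0.20, 0.001, 0.07, 0.0004)
)

# Keep only common variants (MAF >= 5%); SNPs without a MAF value are dropped here.
maf_threshold <- 0.05
common_pos <- snps$pos[!is.na(snps$maf) & snps$maf >= maf_threshold]

# Mask the common SNP positions with "N" in the reference.
bases <- strsplit(ref, "")[[1]]
bases[common_pos] <- "N"
masked <- paste(bases, collapse = "")

# Find maximal runs that contain no common variant.
runs <- rle(bases != "N")
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
free <- data.frame(start = starts, end = ends, width = runs$lengths)[runs$values, ]

# Windows long enough for a primer (the length is an arbitrary assumption).
min_primer_len <- 18
candidates <- free[free$width >= min_primer_len, ]
print(masked)
print(candidates)
```

Whether SNPs with no reported MAF should be treated as rare (as here) or masked to be safe is a design choice that depends on how conservative the primer design needs to be.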

cheers,

robert.
