Question: injectSNPs only for common variants
3.1 years ago, arubio0 wrote:

Hi all, I am developing a method to automatically select primers to validate splicing events. I am using SNPlocs.Hsapiens.dbSNP144.GRCh37 and injectSNPs to identify regions with genomic variants (to avoid placing the primers on them). So far, so good.

However, the number of SNPs is very high (around 150 million), and it is almost impossible to find a sufficiently large region with no variants in which to place the primers. Is it possible to inject into the reference genome only the SNPs with a minor allele frequency (MAF) larger than a threshold (say 5%)? Is the MAF information available somewhere in the Bioconductor annotation data?

Thanks,

Angel


MAF is a population-specific parameter.  You may be able to get some information out of AnnotationHub; I can't verify as I have a terrible connection at the moment.

> library(AnnotationHub)
> ah <- AnnotationHub()
> query(ah, "Common SNPs")
AnnotationHub with 3 records
# snapshotDate(): 2016-10-11
# $dataprovider: UCSC
# $species: Homo sapiens
# $rdataclass: GRanges
# additional mcols(): taxonomyid, genome, description,
#   sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH5105"]]'

           title
  AH5105 | Common SNPs(137)
  AH5108 | Common SNPs(135)
  AH5111 | Common SNPs(132)

It seems that UCSC has a table that will have the information, so you may be able to get useful statistics with rtracklayer. Another approach is to query the 1000 Genomes VCFs: snpStats::col.summary will compute the MAF from a SnpMatrix, which you can obtain via VariantAnnotation::genotypeToSnpMatrix.
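
For what it's worth, a minimal sketch of the VCF route could look like the following. It uses the small example VCF that ships with VariantAnnotation as a stand-in for a real 1000 Genomes file (an assumption made purely for illustration; with only a handful of samples the MAF estimates are very coarse), and the 5% cutoff is the threshold mentioned in the question.

```r
library(VariantAnnotation)
library(snpStats)

# Stand-in input: example VCF bundled with VariantAnnotation (chr22 subset).
# Replace with a real 1000 Genomes VCF for actual use.
vcf_file <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
vcf <- readVcf(vcf_file, genome = "hg19")

# Convert the GT calls to a SnpMatrix; variants that are not biallelic SNVs
# end up with missing genotypes (a warning about this is expected).
snp_mat <- genotypeToSnpMatrix(vcf)

# One row of summary statistics per variant, including a MAF column.
snp_stats <- col.summary(snp_mat$genotypes)

# Keep the positions of variants with a MAF of at least 5%.
maf_cutoff <- 0.05
is_common <- !is.na(snp_stats$MAF) & snp_stats$MAF >= maf_cutoff
common_snps <- rowRanges(vcf)[is_common]
```

The resulting common_snps is a GRanges of the variants to avoid; the MAF is of course specific to whatever samples are in the VCF.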

Answer: injectSNPs only for common variants
3.1 years ago, Robert Castelo (Spain/Barcelona/Universitat Pompeu Fabra) wrote:

Hi,

The MafDb.* annotation packages store MAF values from a number of sources:

MafDb.1Kgenomes.phase1.hs37d5
MafDb.1Kgenomes.phase3.hs37d5
MafDb.ESP6500SI.V2.SSA137.GRCh38
MafDb.ESP6500SI.V2.SSA137.hs37d5
MafDb.ExAC.r0.3.1.nonTCGA.snvs.hs37d5
MafDb.ExAC.r0.3.1.snvs.hs37d5


To access the values through those packages, first install the package you need, then load it and use the function mafByOverlaps() or mafById(). Type the following to see an example:

library(MafDb.1Kgenomes.phase3.hs37d5)
example(MafDb.1Kgenomes.phase3.hs37d5)
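
Once MAF values are attached to the SNP positions, screening primer candidates against common variants only might look roughly like this. This is just a sketch with toy data: the coordinates, the MAF values and the candidate primer ranges are all made up for illustration, and the MafDb lookup step itself is not shown.

```r
library(GenomicRanges)

# Toy SNPs with made-up MAF values (illustrative only; in practice the MAF
# column would be filled from a MafDb lookup or a similar source).
snps <- GRanges("chr1",
                IRanges(start = c(105, 130, 162, 190), width = 1),
                MAF = c(0.001, 0.12, NA, 0.30))

# Keep only common variants. SNPs without a MAF value are dropped here,
# which is a choice; treating them as common would be more conservative.
maf_cutoff <- 0.05
common <- snps[!is.na(snps$MAF) & snps$MAF >= maf_cutoff]

# Hypothetical candidate primer sites, 20 nt each.
primers <- GRanges("chr1", IRanges(start = c(100, 125, 155), width = 20))

# A primer is considered usable if it overlaps no common SNP.
primers$usable <- !overlapsAny(primers, common)
primers
```

Filtering the SNP ranges before checking overlaps sidesteps the question of injecting a subset of SNPs into the genome: the reference sequence stays untouched and only the set of positions to avoid changes.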

cheers,

robert.