Hi all, I am developing a method to automatically select primers for validating splicing events. I am using SNPlocs.Hsapiens.dbSNP144.GRCh37 and injectSNPs to identify regions with genomic variants (to avoid placing the primers on them). So far, so good.
However, the number of SNPs is very high (around 150 million), and it is almost impossible to find a sufficiently large variant-free region in which to place the primers. Is it possible to inject into the reference genome only the SNPs with a minor allele frequency (MAF) above a threshold (say 5%)? Is the MAF information available somewhere in the Bioconductor annotation data?