Hi,
I am trying to get the MAF values for the variants in my VCF file in R using biomaRt.
The reference I used to generate the VCF file is hg19, so I cannot use the latest version of the Ensembl database. I checked the Ensembl website and found that the latest archive for hg19 (GRCh37) is the one from Feb 2014 (http://useast.ensembl.org/info/website/archives/index.html).
So I connected to this archive with biomaRt to look up the MAF for the variants in my VCF file.
I ran the following in R to get the MAF for SNP rs200322093:
library("biomaRt")
snpdetail=useMart("ENSEMBL_MART_SNP", dataset="hsapiens_snp",host="feb2014.archive.ensembl.org", path="/biomart/martservice",archive=FALSE)
getBM(attributes=c("minor_allele_freq","refsnp_id"), filters="snp_filter", values="rs200322093", mart=snpdetail)
The result I got is NA:
minor_allele_freq refsnp_id
1 NA rs200322093
But I checked the dbSNP website, and for GRCh37 this SNP does have a MAF value, 0.0012 (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=200322093).
Does anyone know the reason?
In addition, I noticed that the latest version of the database has a field called "minor_allele_freq_second". What is the difference between it and MAF?
Thank you.
Thanks. It works.
Hi, Thomas:
Is this link the latest version for hg19? I tried to get the MAF of SNP rs4829390 from your website and it gave me NA, but according to dbSNP the MAF should be 0.3796. Do you have a link to the latest version that uses hg19?
Thank you.
Hello,
I am afraid we don't have MAF information for rs4829390 on our GRCh37 website (http://grch37.ensembl.org/index.html), as we are displaying information imported from dbSNP 142. We are planning to import dbSNP 144 for our next GRCh38 and GRCh37 release, e!82, scheduled for late September 2015 (please keep an eye on our announcements blog: http://www.ensembl.info).
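For anyone following along in R, here is a minimal sketch of querying the GRCh37 site through biomaRt and listing the IDs that come back without a MAF. It assumes the GRCh37 host serves the same mart (ENSEMBL_MART_SNP), dataset, attribute and filter names used in the archive query earlier in this thread; fetch_maf and missing_maf are hypothetical helper names, not part of biomaRt.

```r
# Hypothetical helper: query the variation mart on a given host.
# Assumption: host "grch37.ensembl.org" exposes ENSEMBL_MART_SNP / hsapiens_snp
# with the same attribute and filter names as the Feb 2014 archive.
fetch_maf <- function(rsids, host = "grch37.ensembl.org") {
  snp_mart <- biomaRt::useMart("ENSEMBL_MART_SNP",
                               dataset = "hsapiens_snp",
                               host = host)
  biomaRt::getBM(attributes = c("refsnp_id", "minor_allele_freq"),
                 filters = "snp_filter",
                 values = rsids,
                 mart = snp_mart)
}

# Hypothetical helper: return the IDs whose MAF is NA in a getBM() result,
# i.e. variants for which this Ensembl release holds no MAF.
missing_maf <- function(res) {
  unique(res$refsnp_id[is.na(res$minor_allele_freq)])
}

# Example (needs the biomaRt package and network access):
# res <- fetch_maf(c("rs200322093", "rs4829390"))
# missing_maf(res)
```

An NA here only means that the Ensembl release behind the chosen host has no MAF for that variant, so the value may still be present in a newer dbSNP build.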
Hope this helps,
Regards,
Thomas