Question about the database version and MAF value in biomaRt
jz6002 • 0
@jz6002-6980
Last seen 9.3 years ago
Canada

Hi, 

I am trying to get the minor allele frequency (MAF) values for the variants in my VCF file using biomaRt in R.

The reference I used to generate the VCF file is hg19, so I cannot use the latest version of the Ensembl database. I checked the Ensembl website and found that the latest archive for hg19 (GRCh37) is the one from February 2014 (http://useast.ensembl.org/info/website/archives/index.html).

I therefore connected to this archive with biomaRt to get the MAF for the variants in my VCF file.

I ran the following in R to get the MAF for SNP rs200322093:

library("biomaRt")
snpdetail <- useMart("ENSEMBL_MART_SNP", dataset = "hsapiens_snp", host = "feb2014.archive.ensembl.org", path = "/biomart/martservice", archive = FALSE)
getBM(attributes = c("minor_allele_freq", "refsnp_id"), filters = "snp_filter", values = "rs200322093", mart = snpdetail)

The result I got is NA:

  minor_allele_freq   refsnp_id
1                NA rs200322093

However, I checked the dbSNP website, and for GRCh37 this SNP does have a MAF value of 0.0012 (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=200322093).

Does anyone know the reason?

In addition, I noticed that the latest version of the database has a field called "minor_allele_freq_second". What is the difference between it and the MAF?

Thank you.

R biomart MAF • 2.6k views
@james-w-macdonald-5106
Last seen 1 day ago
United States

The webpage you reference shows information for multiple builds, so there is no reason to assume that the MAF was already available for GRCh37. In fact, this SNP comes from the 1000 Genomes Project, and if you look at the submission date (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ss.cgi?subsnp_id=1289403710), you can see that it was submitted in August 2014, so you should not expect to find this information in the archived version of Ensembl from February 2014.

Thomas Maurel ▴ 800
@thomas-maurel-5295
Last seen 21 months ago
United Kingdom

Hello,

You can get this information from our GRCh37 website, as we imported 1000 Genomes phase 3 and dbSNP 142 in March 2015 (http://www.ensembl.info/blog/2015/03/31/first-update-of-the-ensembl-grch37-site/).

> library("biomaRt")
> snpdetail <- useMart("ENSEMBL_MART_SNP", dataset = "hsapiens_snp", host = "grch37.ensembl.org", path = "/biomart/martservice", archive = FALSE)
> getBM(attributes = c("minor_allele_freq", "refsnp_id"), filters = "snp_filter", values = "rs200322093", mart = snpdetail)
  minor_allele_freq   refsnp_id
1        0.00119808 rs200322093
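
For more than one variant, the same query can be wrapped in a small helper. This is only a sketch: `fetch_maf` and `grch37_snp` are hypothetical names, and the commented usage assumes a recent biomaRt release in which `useEnsembl()` accepts `GRCh = 37` and the `"snps"` mart alias. The attribute and filter names are the ones used in the query above.

```r
# Hypothetical helper: look up the MAF for a vector of rsIDs in one request.
# `mart` is any SNP mart object, e.g. the `snpdetail` object created above.
# Nothing touches the network until the function is called.
fetch_maf <- function(rs_ids, mart) {
  biomaRt::getBM(
    attributes = c("minor_allele_freq", "refsnp_id"),
    filters    = "snp_filter",
    values     = unique(rs_ids),  # drop duplicate IDs before querying
    mart       = mart
  )
}

# Usage (needs network access; assumes useEnsembl() supports GRCh = 37):
# grch37_snp <- biomaRt::useEnsembl(biomart = "snps", dataset = "hsapiens_snp", GRCh = 37)
# fetch_maf(c("rs200322093", "rs4829390"), grch37_snp)
```

Passing the mart object in as an argument makes it easy to run the same IDs against a different host (for example an archive) and compare the results.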

 

Hope this helps,

Regards,

Thomas


Thanks. It works.


Hi, Thomas:

Is this link the latest version for hg19? I tried to get the MAF of SNP rs4829390 from your website, and it gave me NA. However, according to dbSNP the MAF should be 0.3796. Do you have a link to the latest version that uses hg19?

Thank you.


Hello,

I am afraid we don't have MAF information for rs4829390 on our GRCh37 website (http://grch37.ensembl.org/index.html), as we are displaying information imported from dbSNP 142. We are planning to import dbSNP 144 for our next GRCh38 and GRCh37 release, e!82, scheduled for late September 2015 (please keep an eye on our announcement blog: http://www.ensembl.info).
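
When checking many variants at once, it can help to list which requested IDs came back without a MAF, so they can be looked up again after the next release. The following is a minimal sketch in base R: `missing_maf` is a hypothetical helper, and the example data frame simply reuses the two values reported in this thread, laid out like a getBM result.

```r
# Hypothetical helper: return the requested rsIDs that have no usable MAF,
# either because the MAF is NA or because the ID is absent from the result.
missing_maf <- function(requested, result) {
  has_maf <- result$refsnp_id[!is.na(result$minor_allele_freq)]
  setdiff(unique(requested), has_maf)
}

# Example built from the values reported in this thread (no network needed)
result <- data.frame(
  minor_allele_freq = c(0.00119808, NA),
  refsnp_id         = c("rs200322093", "rs4829390"),
  stringsAsFactors  = FALSE
)
missing_maf(c("rs200322093", "rs4829390"), result)
# [1] "rs4829390"
```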

Hope this helps,

Regards,

Thomas

 
