Finding/Understanding Affy 6.0 Annotation File
CantExitVIM ▴ 10
@cantexitvim-15274

First, I apologize if this is a stupid question. I am a grad student and this is my first time working with such data.

Per individual, I have 4,257,405 SNP probes from chromosomes 1-22. These probes are recorded by their dbSNP RS IDs (e.g. rs13031737, rs62116682, etc.). Some of the probes have questionable IDs (chr2:132038:D, chr2:346071:I, etc.) that appear to denote insertions and deletions but can't be searched for in the NCBI dbSNP database (https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?). I am assuming these dbSNP RS IDs may have been modified/edited, but I am not exactly sure.

While I do have positions for all these probes, I am trying to obtain the latest build/annotation. I was told this is the "older" Affy 6.0 chip, and I found the annotation file (http://www.affymetrix.com/support/technical/byproduct.affx?product=genomewidesnp_6). The problem is that when I download the latest build (e.g. GenomeWideSNP_6 Annotations, CSV format, Release 35 (313 MB, 4/30/15)), it does not appear to contain the SNP IDs found in my data files. Additionally, that CSV file contains fewer than 1 million unique probes, whereas my SNP data contains over 4 million.

Lastly, the SNP-position file, which I am trying to update with a more recent build, contains positions for many more SNPs (9,249,128 total) and includes the insertion/deletion nomenclature.

Thus, I am looking for feedback about what I may be doing wrong.

affy affymetrix microarray SNP • 1.1k views

It's not a stupid question, but it is asked in a way that makes it hard for people to answer. How did you get 4M SNPs? The Affy 6.0 SNP array has only about 900K SNP probes, and very few people would use those probes as is; instead they would impute, using either HapMap (if it was done back in the day) or 1000 Genomes data. The fact that you have over 4M SNPs leads me to believe that the data have probably been imputed using 1000 Genomes data, because HapMap, to my knowledge, only got to something like 1.6M SNPs.

If these are imputed data, then there is no point in doing anything with Affy's annotation, because it is no longer applicable.
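
One quick way to check the imputation guess is to count how many of your marker IDs are plain RS IDs, how many use the positional insertion/deletion style (chr2:132038:D), and how many of the RS IDs actually occur in the array annotation. Below is a minimal Python sketch of that tally. The function names are made up for illustration, the `array_rsids` set is a hypothetical stand-in for the RS ID column you would read from the annotation CSV, and the ID pattern is inferred only from the examples in the question, so adjust it if your files use other variants.

```python
import re

# Patterns inferred from the example IDs in the question (assumption:
# positional IDs always look like chr<name>:<position>:<D or I>).
RS_ID = re.compile(r"^rs\d+$")
POSITIONAL_ID = re.compile(r"^chr(\w+):(\d+):([DI])$")


def classify_id(marker_id):
    """Return 'rsid', 'positional_indel', or 'other' for one marker ID."""
    if RS_ID.match(marker_id):
        return "rsid"
    if POSITIONAL_ID.match(marker_id):
        return "positional_indel"
    return "other"


def summarize(marker_ids, array_rsids):
    """Count ID types and how many RS IDs are present in the array annotation."""
    counts = {"rsid": 0, "positional_indel": 0, "other": 0, "on_array": 0}
    for marker_id in marker_ids:
        kind = classify_id(marker_id)
        counts[kind] += 1
        if kind == "rsid" and marker_id in array_rsids:
            counts["on_array"] += 1
    return counts


# Toy input: the four IDs quoted in the question.
marker_ids = ["rs13031737", "rs62116682", "chr2:132038:D", "chr2:346071:I"]
# Hypothetical annotation content, for illustration only; in practice,
# fill this set from the RS ID column of the Affy annotation CSV.
array_rsids = {"rs13031737"}

summary = summarize(marker_ids, array_rsids)
print(summary)
```

If only a small fraction of your RS IDs turn up in the array annotation, and the rest (plus the positional indel IDs) do not, that would be consistent with the data having been imputed rather than read directly off the chip.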
