How is the hgu133plus2.db database built from the Affymetrix probeset information?
@arkajyotibhattacharya-11976

Hi,

I was trying to get the gene mapping for one of the Affymetrix datasets, and I found a discrepancy between the Affymetrix annotation file and the hgu133plus2.db database.

For example, the file downloaded from the Affymetrix website maps the probeset "1553011_at" to two Entrez Gene IDs, 6872 /// 138474, but in the hgu133plus2.db database it is mapped only to 138474. Moreover, in every case where a probeset maps to multiple genes in the Affymetrix file, only one of them is chosen as the mapped gene in hgu133plus2.db. On what basis are the other genes omitted from the mapping? Is there an algorithm that selects the mapped gene from among the candidates?

Link to the Affymetrix file: http://www.affymetrix.com/Auth/analysis/downloads/na36/ivt/HG-U133_Plus_2.na36.annot.csv.zip

Regards,

Arkajyoti Bhattacharya

@james-w-macdonald-5106

The default way to map the Affy probeset IDs to Entrez Gene is to extract the 'Representative Public ID' from the csv file, along with the Entrez Gene ID. The Representative Public ID is either a GenBank or RefSeq ID, which for this probeset is NM_153809.1. That ID is what is used to map the probeset ID to the Entrez Gene ID.

The Entrez Gene IDs are also parsed out of the annotation csv file, but they are only used to map probesets for which the primary mapping failed. In this case, NM_153809 maps to Entrez Gene ID 138474, so that's all you get from the ChipDb package.
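The two-step logic described above can be sketched roughly as follows. This is only an illustration in Python, not the actual code used to build the ChipDb package: the function names, the dictionary layout, the version-stripping step, and the assumption that the fallback returns every ID listed in the csv are all hypothetical. Only the probeset ID, the accession, and the Entrez Gene IDs are taken from this thread.

```python
def strip_version(accession):
    """Drop a trailing '.N' version suffix, e.g. 'NM_153809.1' -> 'NM_153809'."""
    base, sep, version = accession.rpartition(".")
    return base if sep and version.isdigit() else accession


def parse_entrez_field(field):
    """Split an 'a /// b' style csv field into a list of Entrez Gene IDs.

    '---' is treated as a missing-value placeholder (an assumption here).
    """
    parts = [part.strip() for part in field.split("///")]
    return [part for part in parts if part not in ("", "---")]


def map_probeset(row, accession_to_entrez):
    """Primary mapping via the Representative Public ID, csv IDs as fallback."""
    public_id = strip_version(row["representative_public_id"])
    primary = accession_to_entrez.get(public_id)
    if primary is not None:
        # Primary mapping succeeded: the csv Entrez Gene column is ignored.
        return [primary]
    # Primary mapping failed: fall back to the IDs parsed from the csv.
    return parse_entrez_field(row["entrez_gene"])


# Hypothetical lookup table holding the single accession discussed here.
ACCESSION_TO_ENTREZ = {"NM_153809": "138474"}

row = {
    "probeset_id": "1553011_at",
    "representative_public_id": "NM_153809.1",
    "entrez_gene": "6872 /// 138474",
}

print(map_probeset(row, ACCESSION_TO_ENTREZ))
```

With the lookup table populated, only the ID reached through the accession is returned, which matches what the package reports for 1553011_at. With an empty lookup table, the sketch falls back to the IDs listed in the csv.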
