Question: duplicate genes in Affy arrays
Asked 14.4 years ago by jsv@stat.ohio-state.edu
Is there any general procedure for handling duplicate genes in Affy arrays?

For example, for the hu6800 array, which has 7129 probe sets, there are 869 genes that are represented by more than one probe set, with one gene (ACTB) being represented by 9 probe sets.

    g.symbols=aafSymbol(X.gnames,"hu6800")
    ug.symbols <- unlist(g.symbols)
    length(ug.symbols)      #6980 (7129-6980 = 149 with no symbols)
    symbol.usage <- table(ug.symbols)
    sum(symbol.usage>1)     # 869
    max(symbol.usage)       #9

Ignoring this would seem to invalidate a number of multiple comparison procedures. Is it reasonable to average probe set expression levels for the same gene? Are there any "pre-processing" routines that address this issue?

The flip side of this question is "Do probe sets with the same gene symbol really specify the same gene? Does it matter which annotational method is used to name genes?"
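To make the averaging idea concrete, here is a minimal base R sketch. The matrix `expr`, the vector `symbols`, and the probe set names are made-up toy data, not hu6800 values, and the sketch only shows the mechanics; it does not settle whether averaging is appropriate.

```r
# Toy data (hypothetical): 5 probe sets x 3 arrays, log-scale expression
expr <- matrix(c(8, 9, 10,
                 6, 7, 8,
                 5, 5, 5,
                 2, 4, 6,
                 1, 1, 1),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("ps", 1:5), paste0("array", 1:3)))

# Gene symbol for each probe set; NA marks a probe set with no symbol
symbols <- c("ACTB", "ACTB", "GAPDH", "TP53", NA)

# Drop probe sets without a symbol before collapsing
keep <- !is.na(symbols)

# rowsum() adds up rows that share a symbol; dividing each row by the
# number of probe sets for that symbol turns the sums into means
sums <- rowsum(expr[keep, , drop = FALSE], group = symbols[keep])
counts <- as.vector(table(symbols[keep])[rownames(sums)])
gene_avg <- sums / counts

print(gene_avg)
```

The result has one row per gene symbol, so any downstream multiple comparison procedure would count each gene once.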
Tags: hu6800, probe
Answer: duplicate genes in Affy arrays
Answered 14.4 years ago by Suresh Gopalan
Hi,

I don't know if there is a consensus on this issue yet. When I dealt with this for a categorical over-representation analysis, I used the 3'-most probe set (www.pnas.org/cgi/doi/10.1073/pnas.0501211102). This approach is biologically sound, but it has pitfalls too. The other approach I have seen implemented in one software package is to use the probe set with the highest expression.

As to the last question, it depends. Based on published articles using whole-genome tiling arrays, and on listening to how those results are currently interpreted, the answer could be tricky.

Suresh

Suresh Gopalan, Ph.D.

----- Original Message -----
From: <jsv@stat.ohio-state.edu>
To: <bioconductor at stat.math.ethz.ch>
Sent: Thursday, August 18, 2005 7:50 AM
Subject: [BioC] duplicate genes in Affy arrays
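As a rough illustration of the "highest expression" selection mentioned in the answer, the base R sketch below keeps, for each gene symbol, the probe set with the largest mean expression across arrays. The data and all names (`expr`, `symbols`, `ps1` and so on) are hypothetical, and using the mean across arrays as the ranking criterion is an assumption; the software referred to above may define "highest expression" differently. The 3'-most selection is not shown because it needs probe set position annotation that is not part of this toy example.

```r
# Toy data (hypothetical): 5 probe sets x 3 arrays, log-scale expression
expr <- matrix(c(8, 9, 10,
                 6, 7, 8,
                 5, 5, 5,
                 2, 4, 6,
                 1, 1, 1),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("ps", 1:5), paste0("array", 1:3)))

# Gene symbol for each probe set; NA marks a probe set with no symbol
symbols <- c("ACTB", "ACTB", "GAPDH", "TP53", NA)
keep <- !is.na(symbols)

# Mean expression of each annotated probe set across arrays
mean_expr <- rowMeans(expr[keep, , drop = FALSE])

# For each symbol, the name of the probe set with the largest mean
best <- vapply(split(mean_expr, symbols[keep]),
               function(x) names(x)[which.max(x)],
               character(1))

# Keep only the selected probe sets and label the rows by gene symbol
expr_best <- expr[best, , drop = FALSE]
rownames(expr_best) <- names(best)

print(best)
print(expr_best)
```

Unlike averaging, this keeps the measured values of one real probe set per gene, at the cost of discarding the others.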