Question: observations on affyprobeminer
Mark W Kimpel • 830 wrote:
I have recently explored the use of alternative CDFs from affyprobeminer
(APM) on a 36-array dataset derived using the Affy rat2302 chipset. I
used both the Affy CDF and the transcript-level affyprobeminer CDF. I
preprocessed using RMA, filtered using an A/P filter, and statistically
analyzed using an appropriate lme model followed by qvalue FDR
correction. I set my FDR threshold at 5%. I eliminated duplicate genes
by picking the one with the lowest p-value.
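The duplicate-removal step could look roughly like the sketch below. It is written in Python rather than the R/Bioconductor tools used for the analysis, and the row layout and the probeset and gene identifiers are hypothetical placeholders, not values from the dataset.

```python
def dedupe_by_lowest_p(results):
    """Keep, for each gene ID, the probeset with the smallest p-value.

    `results` is an iterable of (probeset_id, gene_id, p_value) tuples.
    This layout is an assumption made for illustration only.
    """
    best = {}
    for probeset, gene_id, p_value in results:
        current = best.get(gene_id)
        # Replace the stored entry only when this p-value is strictly lower.
        if current is None or p_value < current[1]:
            best[gene_id] = (probeset, p_value)
    return best


# Hypothetical rows: two probesets map to gene "1001", one to gene "2002".
rows = [
    ("ps_a", "1001", 0.030),
    ("ps_b", "1001", 0.004),
    ("ps_c", "2002", 0.200),
]
deduped = dedupe_by_lowest_p(rows)
```

Each gene then appears once, represented by its most significant probeset.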
Using the Affy CDF, I got ~2000 significant genes, while APM gave ~1000.
If I choose only those EntrezGene identifiers present on both CDFs, the
number significant with the APM CDF was ~1000, and there was a 90%
overlap with the Affy significant list. My conclusion from the latter
observation is that I am measuring largely the same transcripts/genes
with both CDFs.
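As a sketch of how this comparison might be computed, the Python snippet below restricts both significant lists to the identifiers annotated on both CDFs and then measures their overlap. It assumes the overlap is expressed as the fraction of APM-significant genes that are also Affy-significant; the post does not state the denominator, and the function name and toy identifiers are hypothetical.

```python
def compare_sig_lists(affy_all, apm_all, affy_sig, apm_sig):
    """Compare two significant-gene lists on their shared annotation.

    All arguments are sets of gene identifiers (e.g. EntrezGene IDs).
    """
    shared = affy_all & apm_all              # genes annotated on both CDFs
    affy_shared_sig = affy_sig & shared
    apm_shared_sig = apm_sig & shared
    both_sig = affy_shared_sig & apm_shared_sig
    # Assumed definition: share of APM-significant genes also found by Affy.
    overlap = len(both_sig) / len(apm_shared_sig) if apm_shared_sig else 0.0
    return {
        "shared": shared,
        "both_sig": both_sig,
        "overlap_fraction": overlap,
        # Significant with the Affy CDF but not annotated by APM at all.
        "affy_only_sig": affy_sig - apm_all,
    }


# Toy identifiers, not real EntrezGene IDs.
summary = compare_sig_lists(
    affy_all={"1", "2", "3", "4", "5"},
    apm_all={"2", "3", "4", "6"},
    affy_sig={"1", "2", "3"},
    apm_sig={"2", "3", "4", "6"},
)
```

The `affy_only_sig` set corresponds to the genes that are significant with the Affy CDF but have no APM annotation.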
I was interested in the ~1000 genes which are annotated with the Affy
CDF but not the APM CDF. Following the logic behind APM, I would assume
that these would be largely incorrectly annotated probesets or
probesets that are not really measuring any "real" transcript. This
list should, then, consist largely of random genes. To test this
hypothesis, I used the Category package to test for over-representation
of GO and KEGG categories in my various lists. What I found was a huge
degree of overlap in the over-represented categories between: 1. the
Affy genes also annotated with APM, 2. the Affy genes not annotated
with APM, 3. the genes derived solely from APM.
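For readers unfamiliar with over-representation testing, the following Python sketch shows a plain hypergeometric test and how the enriched categories of two gene lists could then be compared. It does not reproduce the Category package's API or defaults; the function names, the 0.05 cutoff, the absence of multiple-testing correction, and the toy gene and category names are all assumptions made for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n genes without replacement from a universe
    of N genes, K of which belong to the category."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enriched_categories(gene_list, universe, categories, alpha=0.05):
    """Return names of categories over-represented in gene_list.

    `categories` maps a category name to a set of gene IDs. No
    multiple-testing correction is applied in this sketch.
    """
    universe = set(universe)
    genes = set(gene_list) & universe
    hits = set()
    for name, members in categories.items():
        members = set(members) & universe
        k = len(genes & members)
        if k > 0 and hypergeom_upper_tail(k, len(members), len(genes), len(universe)) < alpha:
            hits.add(name)
    return hits


# Toy universe of 20 hypothetical genes and two hypothetical categories.
universe = {f"g{i}" for i in range(20)}
categories = {
    "catA": {f"g{i}" for i in range(0, 5)},
    "catB": {f"g{i}" for i in range(10, 15)},
}
list_one = {"g0", "g1", "g2", "g3"}
list_two = {"g1", "g2", "g3", "g4"}
shared_hits = enriched_categories(list_one, universe, categories) & \
    enriched_categories(list_two, universe, categories)
```

If a gene list were essentially random, few or no categories would be expected to pass the cutoff, so a large `shared_hits` set between lists argues against that reading.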
My conclusion from this latest observation is that APM is not
annotating a large number of genes/transcripts that are in fact real.
Assuming that APM is correctly throwing out some "junk" probesets, is
it throwing out the baby with the bathwater?
I'd be interested to hear the thoughts and experiences of others. I've
certainly run into occasions where Affy-annotated probesets turn out to
represent introns or something other than what they purport to be, and
I was hoping that APM would solve this problem, but I don't want to use
it if it means a massive loss of truly significant data.
Mark
--
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, & Mobile & VoiceMail
(317) 663-0513 Home (no voice mail please)
**************************************************************
Modified 11.6 years ago by Hongfang Liu • 10 • written 11.6 years ago by Mark W Kimpel • 830

