Johnnidis, Jonathan
Dear List,
I remain unsure of how to deal with multiplicate Affymetrix ProbeSets
(in my analysis I need to assign a single fold-change status to each
_gene_, not merely each ProbeSet).
Some have suggested that, given two or more ProbeSets for a given gene
(e.g. 97569_r_at and 97658_r_at on the MgU74Av2 chip for Insulin), if
at least one ProbeSet shows a significant fold-change in an experiment
(and, say, the other(s) show no fold-change), it is fair to regard that
locus as differentially regulated and ignore the other ProbeSet(s).
While this is an attractive solution (one maximizes the number of
potentially differentially regulated loci), I remain unconvinced that
it is scientifically acceptable: doesn't that bias the results of the
experiment based on the experimental variable you are trying to test?
Wouldn't that be quite dangerous?
If there are any thoughts to the contrary, I'd be very interested to
hear them.
Perhaps there are other solutions to dealing with multiplicate
ProbeSets? For example, one might use a less biased criterion to
select the best ProbeSet within a multiplicate group of ProbeSets.
Such criteria could include choosing the ProbeSet that has the highest
signal (i.e. expression) value, the best 3'-5' ratio, or the highest
genomic alignment fidelity.
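
As a rough illustration of the "highest signal" criterion, here is a
minimal Python sketch, not an established method. It keeps, for each
gene, the ProbeSet with the highest mean signal across all arrays,
ignoring which experimental group each array belongs to, so the choice
does not depend on the fold-change being tested. The function name, the
"GeneX" entry with its "hypothetical_at" ProbeSet, and all expression
values are made up for the example; only the two Insulin ProbeSet IDs
come from the text above.

```python
from statistics import mean


def pick_probeset_per_gene(expression, probeset_to_gene):
    """Return {gene: probeset_id}, keeping the ProbeSet with the highest mean signal.

    expression: {probeset_id: [signal on each array]} (all arrays, all groups)
    probeset_to_gene: {probeset_id: gene name}
    """
    best = {}  # gene -> (probeset_id, mean signal)
    for probeset, values in expression.items():
        gene = probeset_to_gene.get(probeset)
        if gene is None:
            continue  # ProbeSet without a gene annotation: skip it
        avg = mean(values)
        if gene not in best or avg > best[gene][1]:
            best[gene] = (probeset, avg)
    return {gene: probeset for gene, (probeset, _) in best.items()}


# Made-up log-scale signals for four arrays; NOT real data.
expression = {
    "97569_r_at": [7.1, 7.3, 6.9, 7.0],
    "97658_r_at": [9.8, 10.1, 9.9, 10.4],
    "hypothetical_at": [5.0, 5.2, 5.1, 4.9],  # invented ProbeSet ID
}
probeset_to_gene = {
    "97569_r_at": "Insulin",
    "97658_r_at": "Insulin",
    "hypothetical_at": "GeneX",  # invented gene
}

chosen = pick_probeset_per_gene(expression, probeset_to_gene)
print(chosen)
```

The same loop could rank by another group-independent score (for
example a 3'-5' ratio) by replacing the call to mean().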
So basically the question is how best to 'summarize' a group of
ProbeSets (as opposed to previous and ongoing debates on how to
summarize individual probes within a ProbeSet).
With thanks for any further discussion,
Jonathan
-----Original Message-----
From: Michael Seewald [mailto:mseewald@gmx.de]
Sent: Thursday, March 25, 2004 3:48 PM
To: Johnnidis, Jonathan
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] how deal with multiplicate affy probes?
As a rule of thumb: if the statistics for a given probe set tell you
that a transcript is significantly deregulated, you can usually trust
it and discard every other probe set for that transcript!
The thing to look at is the probe design itself: download the probe
set from NetAffx and BLAST the single probes against the genome (e.g.
in Ensembl). You will be surprised how many probes match up with
introns or genomic regions that do not correspond to any cDNA!
Two examples: there are 4 probe sets for human Wnt6 (HG-U133AB); 2
match the sense (!) strand and have to be discarded. Out of >12 probe
sets for human CD44, only 4 have probes that completely match the
transcripts; >8 can be discarded.
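
To make the probe-level check concrete, here is a minimal Python
sketch. It is a stand-in for a real BLAST search, not a replacement: it
only looks for exact matches of each probe against one transcript
sequence, on either strand, and counts the outcomes per probe set.
Which orientation counts as "correct" depends on the array design, so
the sketch only reports orientation and does not judge it. The function
names and the toy sequences are invented for the example (real probes
on these arrays are 25-mers).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def classify_probe(probe, transcript):
    """Classify a probe by exact match against a transcript sequence."""
    probe = probe.upper()
    transcript = transcript.upper()
    if probe in transcript:
        return "same_strand"
    if reverse_complement(probe) in transcript:
        return "opposite_strand"
    return "no_match"  # e.g. intronic or unrelated genomic sequence


def summarize_probeset(probes, transcript):
    """Count how many probes of one probe set fall into each class."""
    counts = {"same_strand": 0, "opposite_strand": 0, "no_match": 0}
    for probe in probes:
        counts[classify_probe(probe, transcript)] += 1
    return counts


# Toy sequences, invented for illustration only.
transcript = "ATGGCCCTGTGGATGCGCCTCCTGCCC"
probes = ["GCCCTGTGG", "GGCGCATCC", "TTTTTTTTT"]

summary = summarize_probeset(probes, transcript)
print(summary)
```

A probe set whose probes mostly land in "no_match" or in the
unexpected orientation would be a candidate for discarding, following
the reasoning above.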
Best,
Michael
PS: www.ensembl.org is always a good place to check probe sets. Their
mapping of probe sets does not show the location of single probes,
though...
PPS: On affymetrix.com you can check the "Details" view for a probe
set. There you can discover that 2 probe sets of Wnt6 map to the (-)
strand, which is bad. It doesn't tell you, however, that many probe
sets match intron regions.
On Sat, 20 Mar 2004, Johnnidis, Jonathan wrote:
> I'm a new list member and am not quite sure if this question is
> appropriate for the list, but will shoot anyway. I'm analyzing a
> bunch of data from Affy MgU74Av2 chips and am a bit perplexed as to
> how to treat conflicting expression data from multiplicate probe
> sets (that is, a gene that has >1 probe set designed against it; for
> example, 97569_r_at and 97658_r_at are both probe sets for the
> Insulin gene).