Thank you for your message. Yes, I do use the target sequence for
as opposed to performing multiple blasts with each of the probe
I find this is faster. I am not suggesting that BLASTing all probe set
target sequences is an option, not yet at least - and I am not
this exercise for all genes. However, if there are genes that one is
interested in particularly, and a priori, it is worth it.
For example, I have genomic annotation for 5 genes (ie I know if they
amplifications or deletions in my tumour samples) and am looking at
genes on the Affy U133 chips very carefully (multiple probe sets
specificity) to assess correlations of expression with genomic status.
the MDM2 gene for example, a probe that was said to be _at really
pick up a sea of different transcript variants when assessed by BLAST
verified by subsequent multiple sequence alignment of the target
against the transcript variant sequences found).
Indeed, I agree, this is most likely because the information available
the time did not include these novel variants, but I had the
NetAffx was routinely updated against the current Unigene version. I
somewhat perplexed however, as it seems that although NetAffx is
some of the information is still based on the Unigene 133 version and
some cases the probe set display tool is not sufficiently up to date.
In any case, my initial message did not refer to a widespread issue
technology but aimed to raise a discussion on the issue of _at probe
uniqueness, an issue that I believe could have been dealt with in a
better way in NetAffx.
Many thanks for your attention, Lawrence
Lawrence Paul Petalidis
University of Cambridge
Department of Pathology
From: Laurent Gautier [mailto:firstname.lastname@example.org]
Sent: 26 March 2004 15:06
To: Lawrence Paul Petalidis
Cc: Michael Seewald; Johnnidis, Jonathan;
Subject: RE: [BioC] how deal with multiplicate affy probes?
On Thu, 2004-03-25 at 18:34, Lawrence Paul Petalidis wrote:
> As a note following on from Michael Seewald's message, I totally
> there is a STRONG need to BLAST probe set sequences.
Do we really need to use BLAST (and if so, how would we decide on cut-off
values)? The probes are short oligonucleotides, so I think
perfect string matches are likely to be enough in many cases.
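
A minimal sketch of such a perfect-match check, in Python. This is only an
illustration: the function names and the toy sequences are made up for the
example, they are not real Affymetrix probes, and this is not how NetAffx or
'altcdfenvs' computes its mappings. Both strands are searched, so a probe that
matches only as its reverse complement is reported with strand "-".

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def find_exact_matches(probe, references):
    """List every perfect match of a probe in a set of reference sequences.

    `references` maps a sequence name to its sequence. Each hit is returned
    as (name, strand, 0-based start position). Overlapping hits are kept.
    """
    probe = probe.upper()
    queries = (("+", probe), ("-", reverse_complement(probe)))
    hits = []
    for name, seq in references.items():
        seq = seq.upper()
        for strand, query in queries:
            start = seq.find(query)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(query, start + 1)
    return hits


# Toy data (hypothetical): variant_b is the reverse complement of variant_a.
references = {
    "variant_a": "TTTACGTACGGATCCGGG",
    "variant_b": "CCCGGATCCGTACGTAAA",
}
hits = find_exact_matches("ACGTACGGATCC", references)
for name, strand, start in hits:
    print(name, strand, start)
```

A probe that hits more than one reference, or only the "-" strand of its
intended target, is a candidate for closer inspection. No cut-off value is
needed, which is the point of the suggestion above.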
> I tend to use the probe
> set target sequence instead of the individual probe sequences
At the risk of looking silly, may I ask you to elaborate a bit (I am not
sure I understand... do you mean that you prefer working with the
target sequence a given probe set is supposed to match?... and then you
BLAST it against the rest of the world?)
> will be surprised to see the inconsistency of the Affy annotation,
> cases _at probes are really not unique at all.
I have spent some time damaging my sight by looking at how Affymetrix
probes match reference sequences, and I would not be so fast at
the stone at them. What is there is not perfect (there are obvious
1) it was done some time ago (the Dorian Gray syndrome referred to in a
previous mail)... and your very own "BLASTs" (or whatever else) could
suffer from the same problem in time
2) in some cases I suspect that the people at Affymetrix combined
different sources of information to create the probes in a probe set
(ex: a gene with tentatively 2 different isoforms, and two
entries GENBANK, can lead to a unique probe set by setting the probes
appropriate locations.... whether it is relevant to merge two
isoforms into one goes can then be discussed, but that a different
> So if you are really
> interested in a transcript, BLAST it to make sure you are actually
> what you think you are.
The notion of "alternative mappings" implemented in the package
'altcdfenvs' is worth a look. Staring at probe matches is probably not
the idea of fun many people have, but apparently some start to do it
their favorite genes. I believe that a community-based mapping could
benefit... well... the community...
> Best regards to all, Lawrence
> Lawrence Paul Petalidis
> Ph.D. Candidate
> University of Cambridge
> Department of Pathology
> -----Original Message-----
> From: email@example.com
> [mailto:firstname.lastname@example.org]On Behalf Of Michael
> Sent: 25 March 2004 20:48
> To: Johnnidis, Jonathan
> Cc: email@example.com
> Subject: Re: [BioC] how deal with multiplicate affy probes?
> As a rule of thumb: If statistics based on a given probe set data
> that a transcript is significantly deregulated, you can usually
> discard every other probe set for that transcript!
> The thing to look at is the probe design itself: Download the probe
> NetAffx and blast the single probes against the genome (e.g. in
> will be surprised, how many probes match up with introns or genomic
> that do not correspond to any cDNA!
> 2 examples: There are 4 probe sets for human Wnt6 (HG-U133AB), 2
> the sense (!) strand and have to be discarded. Out of >12 probe sets
> CD44, only 4 have probes that are completely matching the
> be discarded.
> PS: www.ensembl.org is always a good place to check probe sets.
> of probe sets does not show the location of single probes, though...
> PPS: In affymetrix.com you can check out the "Details" view for a
> There you can discover that 2 probe sets of Wnt6 map to the (-)
> which is bad. It doesn't tell you, however, that many probe sets
> On Sat, 20 Mar 2004, Johnnidis, Jonathan wrote:
> > I'm a new list member and am not quite sure if this question is
> > for the list, but will shoot anyway. I'm analyzing a bunch of data
> > MgU74Av2 chips and am a bit perplexed as to how to treat
> > expression data from multiplicate probe sets (that is a gene that
> > probe set designed against it (for example, 97569_r_at and
> > both probes for the Insulin gene).
> Bioconductor mailing list