Duplicate probesets in mouse4302


Paul Geeleher ★ 1.3k

@paul-geeleher-2679


I'm analyzing a dataset from a mouse4302 array and I'm interested in the expression levels of the various caspases over the 8 timepoints of the experiment. The mouse4302.db annotation package seems, in many cases, to map several different probesets to the same gene name. For example, 3 probesets have the gene name caspase 3, 3 probesets are called caspase 9, and there are duplicates for caspases 8, 14, 7 and more.

The next problem is that in most cases there is very little correlation between the expression levels of the different probesets that are meant to represent the same thing. For example, caspase 3 has approximate average log expression levels of 5.2, 9.1 and 10.2 in the different probesets, which seem hugely different. There does, however, seem to be some correlation in the change of expression level over the time course.

My question is, first, why is this the case? And second, how should I deal with it to get some idea of what is happening biologically?

--
Paul Geeleher
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland
Galway
Ireland


Henrik Bengtsson ★ 2.4k

@henrik-bengtsson-4333


United States

Hi.

On Mon, Feb 2, 2009 at 9:28 AM, Paul Geeleher <paulgeeleher at gmail.com> wrote:
> I'm analyzing a dataset from a mouse4302 array and I'm interested in
> looking at the expression levels of the various caspases over the 8
> timepoints of the experiment. The thing is the mouse4302.db annotation
> package seems to, in many cases, map several different probesets to
> the same genename. For example 3 probesets have genename caspase 3; 3
> probesets are called caspase 9 and there are duplicates for caspases 8,
> 14, 7 and more.
>
> The next problem is that there is in most cases very little
> correlation between the expression levels of the different probesets
> which are meant to represent the same thing.
>
> For example, caspase 3 has approximate average log expression levels of
> 5.2, 9.1 and 10.2 in the different probesets, which seem hugely
> different. There does however seem to be some correlation in the
> change of expression level over the time course.

Microarray studies are almost all about comparing signals to a reference. The reference might be the "normal" in a tumor/normal pair, another sample, or a pool of many samples. The idea is that by taking ratios relative to a reference, the gene/probeset/probe/locus/... specific scale factors ("affinities") cancel out.

Your "expression levels" are not relative to a reference and for this reason contain traces of such (unknown) affinities. The different "expression levels" probably carry quite different affinities, and are therefore not comparable. This is why "expression levels" on their own do not make sense. However, when you say there is "some correlation in the change", you are comparing to a common reference (a time point, or relative to each other; I'm not sure how you did it).

Actually, I guess all measurements in the universe require some reference in order to make any sense.

My $.02

/Henrik

> My question is basically first, why is this the case? And second, how
> should I deal with it to get some idea what is happening biologically?
>
> --
> Paul Geeleher
> School of Mathematics, Statistics and Applied Mathematics
> National University of Ireland
> Galway
> Ireland
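As a rough illustration of the point about affinities cancelling, here is a minimal Python sketch. It assumes a simple additive model on the log2 scale (observed log2 signal = probeset-specific offset + true log2 abundance), which is a simplification and not something taken from the data in this thread. The three offsets reuse the average levels quoted in the question only for flavour; the time-course values and all names (`affinity_offset`, `relative_to_reference`, and so on) are made up for the example.

```python
# Hypothetical true log2 abundance of one gene over 8 time points (made up,
# chosen to average to zero so the mean observed level equals the offset).
true_log2_abundance = [-0.5, 0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]

# Hypothetical probeset-specific offsets ("affinities") on the log2 scale.
affinity_offset = {"probeset_A": 5.2, "probeset_B": 9.1, "probeset_C": 10.2}

# Assumed model: observed log2 signal = offset + true log2 abundance.
observed = {
    name: [offset + x for x in true_log2_abundance]
    for name, offset in affinity_offset.items()
}


def relative_to_reference(values, ref_index=0):
    """Return log2 ratios of each time point against a reference time point."""
    ref = values[ref_index]
    return [v - ref for v in values]


# Subtracting the reference on the log scale is the same as taking a ratio
# on the raw scale, so the probeset-specific offset drops out.
log_ratios = {name: relative_to_reference(v) for name, v in observed.items()}

for name, values in observed.items():
    mean_level = sum(values) / len(values)
    print(name,
          "mean log2 level:", round(mean_level, 2),
          "log2 ratios vs. first time point:",
          [round(r, 2) for r in log_ratios[name]])
```

Under this assumed model the mean levels differ by several log2 units between probesets, while the log ratios against the first time point come out the same for all three. Real probesets will not behave this cleanly, since affinity is only one of several effects.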


Sean Davis 21k

@sean-davis-490


United States

On Mon, Feb 2, 2009 at 12:28 PM, Paul Geeleher <paulgeeleher@gmail.com> wrote:
> I'm analyzing a dataset from a mouse4302 array and I'm interested in
> looking at the expression levels of the various caspases over the 8
> timepoints of the experiment. The thing is the mouse4302.db annotation
> package seems to, in many cases, map several different probesets to
> the same genename. For example 3 probesets have genename caspase 3; 3
> probesets are called caspase 9 and there are duplicates for caspases 8,
> 14, 7 and more.

Hi, Paul. This is the rule rather than the exception (not in number, but in principle). These are different probesets designed against potentially different transcripts of the same gene.

> The next problem is that there is in most cases very little
> correlation between the expression levels of the different probesets
> which are meant to represent the same thing.
>
> For example, caspase 3 has approximate average log expression levels of
> 5.2, 9.1 and 10.2 in the different probesets, which seem hugely
> different. There does however seem to be some correlation in the
> change of expression level over the time course.

Keep in mind that the absolute expression level is not at all comparable between probesets. For the purposes of comparison, it is only valid and useful to compare within a probeset. There are many reasons for this, including cross-hybridization and probe design issues.

You suggest that a correlation over time exists between probesets; this is a good thing. That said, some probesets will NOT show a correlation over time with other probesets putatively measuring the same gene. Again, there can be multiple reasons for this, and it is not that uncommon.

Sean
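One way to check whether two probesets for the same gene agree over the time course, without comparing their absolute levels, is to correlate their profiles. The sketch below is only an illustration in plain Python: the three profiles are invented numbers, not values from the mouse4302 dataset discussed here, and `pearson` is a hand-written helper rather than a function from any Bioconductor package.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 expression over 8 time points for three probesets
# annotated to the same gene (all values made up for illustration).
profiles = {
    "probeset_A": [5.0, 5.3, 5.6, 5.4, 5.1, 4.9, 5.0, 5.2],
    "probeset_B": [9.0, 9.4, 9.9, 9.5, 9.1, 8.8, 9.0, 9.3],
    "probeset_C": [10.2, 10.1, 10.3, 10.2, 10.3, 10.1, 10.2, 10.2],
}

# Correlation ignores each probeset's overall level and scale, so it
# compares only the shape of the change over time.
names = sorted(profiles)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson(profiles[first], profiles[second])
        print(first, "vs", second, "r =", round(r, 3))
```

In these invented numbers, probesets A and B sit about four log2 units apart yet follow the same shape, while C is nearly flat and tracks A less closely. That mirrors the two situations described above: agreeing probesets, and a probeset that does not follow the others.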
