Question

analysis of HG_U95A vs. HG_U95Av2

Alex Tsoi ▴ 260

@alex-tsoi-2154

Last seen 10.2 years ago

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070718/f6105e1a/attachment.pl

updated 17.4 years ago by Rob Scharpf ▴ 250 • written 17.4 years ago by Alex Tsoi ▴ 260

score 0 · Answer 1 · 2007-07-18

Alex Tsoi wrote:
> Dear all,
>
> I have a cancer dataset from GEO that is labeled as having the platform GPL 91
> (HG-U95A), and when I use justRMA() to read the data, I realize that the
> GSMs are from HG_U95A and HG_U95Av2, and that gives me the error. I could
> analyze the data separately, but I just want to ask if anyone has experience
> or comments about the difference between the two platforms, AND could I treat
> the data as coming from one platform and analyze them (e.g. by using RMA); of
> course, if that's the case, I have to "make" R believe that they are coming
> from only one platform. Or what's the most proper way to analyze these kinds
> of data?
>
> I greatly appreciate the help and the comments.
>
> P.S.: this is a cancer dataset with two types of disease state, and each
> type could come from either the HG-U95A or the HG_U95Av2.

This is a difficult problem since there are platform-specific effects. For example, you might think that a probeset which is shared between the two platforms would be safe to compare, but unfortunately it will behave slightly differently on one platform than on the other, even though in theory it is measuring the same thing.

You could start by just normalizing these two array types in separate pools. Then you could take probesets that are supposedly shared between them and look to see how they behave in their respective conditions. In general, I expect you will find that shared probesets move in the same direction on each platform under your experimental conditions, but that you get different absolute results on one platform than on the other for a given condition. In other words, both the condition and the platform will contribute to the overall signal.

The easiest thing is always to look at one platform at a time, but if you *must* combine them, grab a statistician first to help you do something sensible.

Good luck,

Marc
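To make the point about condition and platform both contributing to the signal concrete, here is a minimal sketch in plain Python. It assumes each platform has already been normalized in its own pool and looks at a single shared probeset. All values, and the names `values`, `condition_effect`, and `platform_shift`, are invented for illustration; nothing here comes from the dataset in question.

```python
from statistics import mean

# Hypothetical log2 expression values for ONE shared probeset, taken after
# each platform was normalized in its own pool. All numbers are invented.
values = {
    ("U95A", "disease1"): [7.1, 7.3, 7.2],
    ("U95A", "disease2"): [8.0, 8.2, 8.1],
    ("U95Av2", "disease1"): [7.6, 7.8, 7.7],
    ("U95Av2", "disease2"): [8.5, 8.6, 8.7],
}


def condition_effect(platform):
    """Within-platform difference in mean log2 signal (disease2 - disease1)."""
    return mean(values[(platform, "disease2")]) - mean(values[(platform, "disease1")])


def platform_shift(condition):
    """Between-platform difference in mean log2 signal (U95Av2 - U95A)."""
    return mean(values[("U95Av2", condition)]) - mean(values[("U95A", condition)])


effects = {p: condition_effect(p) for p in ("U95A", "U95Av2")}
shifts = {c: platform_shift(c) for c in ("disease1", "disease2")}

# The probeset "moves the same direction" if both within-platform effects
# have the same sign, even when the absolute levels differ between platforms.
same_direction = effects["U95A"] * effects["U95Av2"] > 0

print("condition effect per platform:", effects)
print("platform shift per condition:", shifts)
print("same direction on both platforms:", same_direction)
```

In this toy case the within-platform effects agree while the absolute levels differ by a constant shift. That is the situation in which comparing within each platform is safe but naively pooling the raw values is not.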

score 0 · Answer 2 · 2007-07-18

My take on the matter is here:

http://bmbolstad.com/misc/mixtureCDF/MixtureCDF.html

Mind you, it is from the very distant past, so none of the packaged material there will work now or with any recent version of the software.

The key point is that the differences between U95A and U95Av2 are fairly small: only a relative handful of probesets differ between the two (I think something like 25 out of ~12600).

Best,

Ben

On Wed, 2007-07-18 at 18:27 -0400, Alex Tsoi wrote:
> [snip]
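If only a handful of probesets differ, one simple first step is to list them explicitly and keep only the shared ones for any cross-platform comparison. The sketch below shows that bookkeeping in plain Python. The probeset IDs and the function name `compare_probesets` are hypothetical placeholders; the real ID lists would have to come from the annotation for each chip type. It is not the approach described at the link above.

```python
def compare_probesets(ids_a, ids_b):
    """Split two probeset ID lists into shared and platform-specific sets."""
    a, b = set(ids_a), set(ids_b)
    return {
        "shared": sorted(a & b),   # present on both chip types
        "only_a": sorted(a - b),   # present only on the first chip type
        "only_b": sorted(b - a),   # present only on the second chip type
    }


# Hypothetical placeholder IDs, not real HG-U95 probeset names.
u95a_ids = ["probe_0001_at", "probe_0002_at", "probe_0003_at", "probe_old_at"]
u95av2_ids = ["probe_0001_at", "probe_0002_at", "probe_0003_at", "probe_new_at"]

result = compare_probesets(u95a_ids, u95av2_ids)
print(len(result["shared"]), "shared;",
      len(result["only_a"]), "only on U95A;",
      len(result["only_b"]), "only on U95Av2")
```

Restricting to the shared set does not remove platform effects on those probesets; it only ensures the two chip types are being compared on the same identifiers.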

score 0 · Answer 3 · 2007-07-19

On Jul 19, 2007, at 6:00 AM, bioconductor-request at stat.math.ethz.ch wrote:
> [snip]

I would begin as Marc suggests, and then explore integrative correlation as a way to identify a set of genes that is reproducible across two or more studies (platforms). See the MergeMaid package and the references therein. Once you have identified a reproducible set, you may want to explore the packages metaArray and GeneMeta.

Rob
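For readers unfamiliar with the idea, integrative correlation can be sketched roughly as a "correlation of correlations". For each gene, compute its correlations with every other gene within each study, then correlate those two profiles across studies. The plain-Python version below is a simplified illustration under that assumption, with invented data and hypothetical function names. It is not MergeMaid's implementation, which should be used for real analyses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_profile(study, gene):
    """Correlations of `gene` with every other gene within one study."""
    return [pearson(study[gene], study[other])
            for other in sorted(study) if other != gene]


def integrative_correlation(study1, study2):
    """Per-gene correlation, across studies, of within-study profiles."""
    genes = sorted(set(study1) & set(study2))  # shared genes only
    s1 = {g: study1[g] for g in genes}
    s2 = {g: study2[g] for g in genes}
    return {g: pearson(correlation_profile(s1, g), correlation_profile(s2, g))
            for g in genes}


# Invented expression values: gene -> one value per sample.
study_a = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.2],
    "g3": [4.0, 3.0, 2.5, 1.0],
    "g4": [1.0, 3.0, 2.0, 2.5],
}
# Same biology with a per-platform scale and offset applied (hypothetical).
study_b = {g: [1.2 * v + 0.5 for v in vals] for g, vals in study_a.items()}

scores = integrative_correlation(study_a, study_b)
print(scores)
```

Because the score is built from within-study correlations, a constant platform shift or rescaling does not change it. Genes whose scores fall well below 1 in real data would be candidates to exclude from the reproducible set.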