Resummarisation of Exon array normalised at probeset level to gene level
i.sudbery ▴ 40
@isudbery-8266
Last seen 7 weeks ago
European Union

We wish to analyse an Exon Array dataset we obtained from a public source (unfortunately not GEO). The data we have is a matrix of RMA-normalised expression values from some 400 Exon arrays, summarised at the probeset level. We are only interested in the gene level. Is there any way to summarise to the gene level from this starting point?

normalization oligo limma • 1.2k views
@james-w-macdonald-5106
Last seen 4 hours ago
United States

The short answer is no. To summarize at the gene level using RMA requires the probe-level data. You could hypothetically group all the probesets for a given gene together and then summarize in some fashion, but the resulting values would not be the same as what you would get if you summarized using RMA.
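
As a rough illustration of the "group the probesets and summarize in some fashion" idea, here is a minimal sketch in plain Python. It assumes the probeset-level matrix is available as a mapping from probeset ID to a list of log2 values (one per array) and that a probeset-to-gene annotation exists; the names `summarise_to_gene`, `expr` and `annotation` are hypothetical, and the plain mean is only one possible summary. The result is not equivalent to a gene-level RMA summary.

```python
from collections import defaultdict
from statistics import mean


def summarise_to_gene(probeset_expr, probeset_to_gene):
    """Average probeset-level log2 values into one row per gene.

    probeset_expr: dict of probeset ID -> list of log2 values, one per array.
    probeset_to_gene: dict of probeset ID -> gene ID (hypothetical annotation).
    Probesets missing from the annotation are skipped.
    """
    grouped = defaultdict(list)
    for probeset, values in probeset_expr.items():
        gene = probeset_to_gene.get(probeset)
        if gene is not None:
            grouped[gene].append(values)
    # zip(*rows) walks the arrays column by column, so each gene gets
    # one averaged value per array.
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Toy data: three probesets measured on two arrays.
expr = {
    "ps1": [8.0, 9.0],
    "ps2": [6.0, 7.0],
    "ps3": [5.0, 5.5],
}
annotation = {"ps1": "geneA", "ps2": "geneA", "ps3": "geneB"}

gene_expr = summarise_to_gene(expr, annotation)
print(gene_expr)
```

A median or another robust summary could be swapped in for `mean` if a few probesets per gene behave very differently from the rest.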


If I took the mean of the probesets for each gene, do you think the resulting values would be valid for downstream limma analysis?


I actually have no idea. It's certainly one thing you can do, and it might not be the worst idea in the world, but ideally you would do some conventional EDA (exploratory data analysis) first to see if it looks like taking means is a reasonable thing to do.
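
One possible EDA check, sketched under assumptions: for a single gene, correlate each probeset across arrays with the mean of that gene's other probesets. Probesets that track the rest poorly suggest that a plain mean may blur real differences (for example alternative exon usage). The function names and toy values below are hypothetical, and this is only one of many checks that could be done.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def probeset_agreement(rows):
    """Correlate each probeset with the mean of the gene's other probesets.

    rows: list of probeset rows for one gene (at least two),
    each a list of log2 values across the same arrays.
    """
    result = []
    for i, row in enumerate(rows):
        others = [r for j, r in enumerate(rows) if j != i]
        rest_mean = [sum(column) / len(column) for column in zip(*others)]
        result.append(pearson(row, rest_mean))
    return result


# Toy gene with three probesets on four arrays.
gene_rows = [
    [7.0, 8.0, 9.0, 10.0],
    [6.1, 7.0, 8.2, 9.0],
    [5.0, 4.0, 5.0, 4.0],  # does not track the other two probesets
]
agreement = probeset_agreement(gene_rows)
print(agreement)
```

Here the first two probesets correlate strongly with the rest of the gene, while the third comes out negative, which is the kind of pattern worth inspecting before averaging.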

An alternative would be to make comparisons at the probeset (exon-ish) level, and look for consistent differences over the set of probesets for each gene. The downsides to that approach are that the probesets (usually) have only four probes each, and the Exon arrays are much dimmer than the old 3'-biased arrays, so you have to wonder about the signal-to-noise ratio with just four dim probes per probeset. You also increase the multiple-testing burden quite a bit, which will not help things at all.
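
To make "consistent differences over the set of probesets" concrete, here is a small sketch that assumes per-probeset log fold changes have already been estimated (for instance from a probeset-level fit) and grouped by gene. It reports the fraction of a gene's probesets that share the majority direction. The names and numbers are hypothetical, and this is a descriptive screen rather than a formal test, so it does nothing about the multiplicity issue.

```python
def direction_consistency(log_fold_changes):
    """Fraction of a gene's probesets whose log fold change shares the majority sign."""
    up = sum(1 for value in log_fold_changes if value > 0)
    down = sum(1 for value in log_fold_changes if value < 0)
    return max(up, down) / len(log_fold_changes)


# Hypothetical probeset-level log2 fold changes, grouped by gene.
probeset_lfc = {
    "geneA": [0.9, 1.1, 0.7, 1.3],    # all probesets move the same way
    "geneB": [0.8, -0.6, 0.1, -0.9],  # probesets disagree
}

consistency = {
    gene: direction_consistency(values)
    for gene, values in probeset_lfc.items()
}
print(consistency)
```

A gene like `geneA`, where every probeset moves in the same direction, is a more convincing gene-level change than one like `geneB`, where the probesets split evenly.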

Ideally you would go back to whoever submitted the data, and they would be oh so happy to supply you with the original CEL files. Is that in the cards?


I'm going to ask. But the data comes from a massive consortium, and has been around for some time without being uploaded to GEO or similar, or even published. They have lots of data sets I'd like to get my hands on, but they only make summarized versions of them available, which is annoying because I'd really like to study genes that they have excluded from their analyses.
