Question

Combining old Affy arrays for limma

james perkins ▴ 300

@james-perkins-2675

Last seen 9.6 years ago

Apologies if this topic has already come up (I suspect it has!); I did try searching the mailing list, but to no avail.

I am combining some old Affy datasets from 2002, which I have obtained via GEO. They are in the "Gene Expression Data Matrix" format, i.e. each array has already been background-corrected, summarised, etc. However, no differential expression analysis has been done.

The datasets are of the "RGU-34" type, so there are 3 arrays with 8,000 features each: RGU34A, RGU34B and RGU34C. I want to combine this data with some more recent data from RAT2302 arrays. I understand I can map the identifiers between platforms using some spreadsheets downloaded from the Affymetrix website.

My problem is how to work out differential expression for the comparison. Is it correct to combine the Gene Expression Data matrices for A, B and C into big esets of (8,000 * 3 =) 24,000 identifiers, and then use these in limma to work out differential expression? Or would it be better to treat the arrays separately and then combine the results, ordering by p-value?

The reason I want to combine the 3 RGU arrays is that I wish to (amongst other things) perform category analysis to compare the old data with the new data.

Any information on the best way to combine these arrays would be much appreciated.

Kindest regards,
James Perkins

rat2302 rgu34a rgu34b rgu34c affy limma Category • 816 views

updated 15.5 years ago by Sean Davis 21k • written 15.5 years ago by james perkins ▴ 300

score 0 · Answer 1 · 2008-11-03

On Mon, Nov 3, 2008 at 5:58 AM, James Perkins <jperkins@biochem.ucl.ac.uk> wrote:

> Apologies if this topic has already come up (I suspect it has!), I did try
> searching the mailing list but to no avail;
>
> I am combining some old affy datasets from 2002, which I have obtained via
> GEO. They are of the "Gene Expression Data Matrix" format; i.e. each array
> has been background processed, summarised etc. However no differential
> expression has been undertaken.
>
> The datasets are of the "RGU-34" type, so there are 3 arrays with 8,000
> features, RGU34A, RGU34B and RGU34C respectively. I want to combine this
> data with some more recent data from RAT2302 arrays. I understand I can
> combine the identifiers using some spreadsheets downloaded from the
> affymetrix website.
>
> However my problem is how I work out differential expression for
> comparison. Is it correct to combine the Gene Expression Data matrices for A
> B and C to make big esets of (8,000 * 3 =) 24,000 identifiers, and then use
> these in Limma to work out differential expression? Or would it be better to
> treat the arrays separately, then combine the results, ordering by p-value.

If the three arrays have very few probesets in common, then normalizing separately and then combining for differential expression will work just fine. Just FYI, there are numerous posts in the archives on this topic.

Sean

> The reason I want to combine the 3 RGU arrays is because I wish to (amongst
> other things) perform category analysis to make a comparison between the old
> data and the new data.
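For readers who want to see the "normalize separately, then combine" idea spelled out, here is a minimal, language-neutral sketch in plain Python. It is not limma: in R one would typically row-bind the per-chip expression matrices and hand the result to limma's `lmFit` and `eBayes`, which compute moderated t-statistics. The sketch below substitutes an ordinary Welch t-statistic purely for illustration. The function names (`stack_chips`, `welch_t`, `rank_probesets`), the sample names, the probeset IDs and the expression values are all made up. It assumes each sample was hybridised to every chip, and it keys rows by (chip, probeset) so that the few probesets shared between chips (for example controls) do not collide.

```python
from math import sqrt
from statistics import mean, variance


def stack_chips(chips, sample_order):
    """Row-stack per-chip matrices into one table keyed by (chip, probeset).

    Columns of every chip are reordered to the shared sample_order.
    """
    combined = {}
    for chip_name, chip in chips.items():
        # Map each sample name to its column position on this chip.
        col = {s: i for i, s in enumerate(chip["samples"])}
        for probe_id, values in chip["rows"].items():
            combined[(chip_name, probe_id)] = [values[col[s]] for s in sample_order]
    return combined


def welch_t(x, y):
    """Ordinary Welch t-statistic for mean(y) - mean(x); not limma's moderated t."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se


def rank_probesets(combined, sample_order, groups):
    """Score every row of the stacked table and sort by |t|, largest first."""
    ctrl = [i for i, s in enumerate(sample_order) if groups[s] == "control"]
    trt = [i for i, s in enumerate(sample_order) if groups[s] == "treated"]
    scored = []
    for key, values in combined.items():
        t = welch_t([values[i] for i in ctrl], [values[i] for i in trt])
        scored.append((key, t))
    return sorted(scored, key=lambda kv: abs(kv[1]), reverse=True)


# Toy data (invented): two chips, same four samples, different column order.
sample_order = ["c1", "c2", "t1", "t2"]
groups = {"c1": "control", "c2": "control", "t1": "treated", "t2": "treated"}
chips = {
    "RGU34A": {
        "samples": ["c1", "c2", "t1", "t2"],
        "rows": {
            "probe_a1": [7.0, 7.2, 9.1, 9.3],
            "AFFX-ctrl": [5.0, 5.1, 5.0, 5.2],
        },
    },
    "RGU34B": {
        "samples": ["t1", "c1", "t2", "c2"],
        "rows": {
            "probe_b1": [6.1, 6.0, 6.3, 6.2],
            "AFFX-ctrl": [5.1, 5.0, 5.2, 5.1],
        },
    },
}

combined = stack_chips(chips, sample_order)
ranked = rank_probesets(combined, sample_order, groups)
for (chip, probe), t in ranked:
    print(f"{chip}\t{probe}\t{t:.2f}")
```

Because each probeset is tested on its own row, stacking the chips does not change any single probeset's statistic; what it changes is the ranking and any multiple-testing adjustment, which are then done over all probesets at once rather than per chip.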