Question: Analysis of HGU133 set / Merge ExpressionSet
3.6 years ago, giroudpaul40 (France) wrote:

Hello,

I am currently trying to extract data from a GEO dataset that was generated on the Affymetrix HGU133 platform, meaning that every sample was run on two different chips: hgu133a and hgu133b.

Following the advice given in these threads:

Combining HGU133A & HGU133B data

A: Methods to combine U133A and B?

https://www.biostars.org/p/56657/

I ran RMA normalization separately on each chip type, so I now have two expression sets:

data.rmaA = rma(dataA)
data.rmaB = rma(dataB)
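
As a rough sketch of how the CEL files might be separated by chip type before the two rma() calls, the snippet below groups file names by the chipA/chipB suffix seen in the sample names further down. It assumes every file follows that naming pattern. The commented lines assume the affy package is installed and the CEL files are available locally.

```r
# File names copied from the sample names shown in this post (subset only).
cel_files <- c("GSM115046_M0_D1_chipA.CEL", "GSM115047_M0_D2_chipA.CEL",
               "GSM115060_M2_D3_chipA.CEL", "GSM115061_M0_D1_chipB.CEL",
               "GSM115062_M0_D2_chipB.CEL", "GSM115075_M2_D3_chipB.CEL")

# Extract the chip label ("chipA" or "chipB") from the end of each file name.
chip <- sub("^.*_(chip[AB])\\.CEL$", "\\1", cel_files)

# One character vector of file names per chip type.
by_chip <- split(cel_files, chip)

# With affy installed and the files on disk, each group could then be read
# and normalized on its own (left commented because it needs the real data):
# dataA <- affy::ReadAffy(filenames = by_chip$chipA)
# data.rmaA <- affy::rma(dataA)
# dataB <- affy::ReadAffy(filenames = by_chip$chipB)
# data.rmaB <- affy::rma(dataB)
```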

The next step would be to combine these two expression sets into one before continuing my analysis with limma.

What is the best way to do this? What about redundant genes? Will I be able to compare expression levels between hgu133a and hgu133b even though the normalization was done separately?

EDIT:

Actually, this seems to be a common question that is not specific to my case. The trouble is that the solutions I found are always for combining different chips, whereas in my case the two are "sister" chips that were run in parallel (same samples, at the same time).

So what is the simplest way to do this? Is it better to use new("exprSet", ....) with the arguments patched together from the two individual HGU133A and HGU133B exprSets, as suggested by Wolfgang Huber in the first link I provided?

I also found this post, where they use the a4Base package: Combining ExpressionSet objects : error with function merge() in "inSilicoMerging"

There is also the combine.eSet() method: Concatenating or merging two or more ExpressionSet objects

Finally, I also found the inSilicoMerging package, which seems more complete but is designed for combining different studies, so I am not sure it applies here (Merge different datasets and perform differential expression analysis in limma).

> data.rmaA
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22283 features, 15 samples
element names: exprs
protocolData
sampleNames: GSM115046_M0_D1_chipA.CEL GSM115047_M0_D2_chipA.CEL ...
GSM115060_M2_D3_chipA.CEL (15 total)
varLabels: ScanDate
phenoData
sampleNames: GSM115046_M0_D1_chipA.CEL GSM115047_M0_D2_chipA.CEL ...
GSM115060_M2_D3_chipA.CEL (15 total)
varLabels: sample
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu133a
> data.rmaB
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22645 features, 15 samples
element names: exprs
protocolData
sampleNames: GSM115061_M0_D1_chipB.CEL GSM115062_M0_D2_chipB.CEL ...
GSM115075_M2_D3_chipB.CEL (15 total)
varLabels: ScanDate
phenoData
sampleNames: GSM115061_M0_D1_chipB.CEL GSM115062_M0_D2_chipB.CEL ...
GSM115075_M2_D3_chipB.CEL (15 total)
varLabels: sample
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu133b 


Well, neither a4Base::combineTwoExpressionSet nor the inSilicoMerging methods work for me.

The first one gives this error, probably because it is not meant to merge ExpressionSets from different chips with different numbers of probes:

data.rmaAB = combineTwoExpressionSet(data.rmaA, data.rmaB)
Error in cbind(assayData(x)$exprs, assayData(y)$exprs) :
number of rows of matrices must match (see arg 2)
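
The error arises because cbind() places matrices side by side, which requires equal row counts. For two chips run on the same samples, the features would instead be stacked row-wise. The following is a minimal base R sketch on toy matrices standing in for exprs(data.rmaA) and exprs(data.rmaB). The probe IDs and values are made up, and the sample key is assumed to be the part of the CEL name between the GSM number and the chip label. It only shows the mechanics and does not address whether values from the two chips are comparable.

```r
# Toy stand-ins for the two expression matrices; all values are invented.
exprsA <- matrix(c(7.1, 8.2, 6.5, 9.0, 5.5, 7.7), nrow = 3,
                 dimnames = list(c("probeA_1", "probeA_2", "AFFX-ctrl"),
                                 c("GSM115046_M0_D1_chipA.CEL",
                                   "GSM115047_M0_D2_chipA.CEL")))
exprsB <- matrix(c(4.9, 6.0, 6.8, 6.1, 5.2, 7.5, 8.8, 7.9), nrow = 4,
                 dimnames = list(c("probeB_1", "probeB_2", "probeB_3", "AFFX-ctrl"),
                                 c("GSM115062_M0_D2_chipB.CEL",
                                   "GSM115061_M0_D1_chipB.CEL")))

# cbind() fails when the row counts differ; capture the message instead of stopping.
cbind_result <- tryCatch(cbind(exprsA, exprsB),
                         error = function(e) conditionMessage(e))

# Derive a shared sample key (e.g. "M0_D1") from the CEL file names.
sample_key <- function(x) sub("^GSM[0-9]+_(.*)_chip[AB]\\.CEL$", "\\1", x)
colnames(exprsA) <- sample_key(colnames(exprsA))
colnames(exprsB) <- sample_key(colnames(exprsB))

# Put the chip B columns in the same sample order as chip A.
exprsB <- exprsB[, colnames(exprsA), drop = FALSE]

# Probeset IDs present on both chips; here the chip A copy is kept (a choice).
shared <- intersect(rownames(exprsA), rownames(exprsB))
exprsAB <- rbind(exprsA, exprsB[!rownames(exprsB) %in% shared, , drop = FALSE])
```

Keeping the chip A copy of a shared probeset is arbitrary; the two chips report different values for it, so another option would be to keep both copies under distinct names.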

inSilicoMerging does not work for me either, because it looks for probes common to the two ExpressionSets and complains that there are only 168 of them (the Affymetrix QC probes).

Still, plotMDS and plotRLE from this same package point to a batch effect between the hgu133A and hgu133B ExpressionSets. So I still need to perform some kind of normalization between the two, right?

Answer: Analysis of HGU133 set / Merge ExpressionSet
3.6 years ago, James W. MacDonald51k (United States) wrote:

You should note that there are almost no overlapping genes being measured on those two arrays, so there is no need to combine. In other words, processing and analyzing the two arrays separately is going to give you the same results that you would get if you were able to combine, so there's no profit in combining.
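
To illustrate the point, here is a small base R sketch on simulated data with a hypothetical two-group design (neither comes from the actual dataset). It uses an ordinary per-probeset t-test, not limma's moderated statistic, and shows that each row's result is the same whether the chips are tested separately or stacked first. With limma itself, eBayes() estimates its variance prior from all probesets in a fit, and multiple-testing adjustment depends on the number of tests, so moderated statistics and adjusted p-values may differ somewhat between the two approaches.

```r
set.seed(42)

# Hypothetical design: 3 control and 3 treated samples, shared by both chips.
group <- factor(rep(c("ctrl", "treated"), each = 3))

# Simulated log-expression matrices standing in for the two chips.
toyA <- matrix(rnorm(4 * 6, mean = 8), nrow = 4,
               dimnames = list(paste0("A_", 1:4), paste0("s", 1:6)))
toyB <- matrix(rnorm(5 * 6, mean = 6), nrow = 5,
               dimnames = list(paste0("B_", 1:5), paste0("s", 1:6)))

# Ordinary equal-variance t statistic for each row (probeset) of a matrix.
row_t <- function(m) {
  apply(m, 1, function(y) {
    unname(t.test(y[group == "treated"], y[group == "ctrl"],
                  var.equal = TRUE)$statistic)
  })
}

# Test each chip on its own, then test the row-stacked matrix.
t_separate <- c(row_t(toyA), row_t(toyB))
t_stacked  <- row_t(rbind(toyA, toyB))

# Each probeset's statistic depends only on its own row.
same <- isTRUE(all.equal(t_separate, t_stacked))
```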