Question

Reading in and normalising ht-hgu133a and hgu133a files together

parisa_1986 • 0

Hi,

I am trying to read in and normalise together hgu133a and ht-hgu133a CEL files from 2 different sources. However, I get the following error message:

> pathA<- "/raid/san/home_shared/parisa/Affy_2015/Batch1_Affy"

> datA<-ReadAffy(celfile.path=pathA)

Error in affyio::read_abatch(filenames, rm.mask, rm.outliers, rm.extra, :

Cel file /raid/san/home_shared/parisa/Affy_2015/5500024030401071707289.A03.CEL does not seem to have the correct dimensions

(The next step would then be rma normalisation: RMA1<-rma(datA))

An online search suggests that affy probably takes the chip geometry from the first CEL file it happens to read. Having checked the files from the 2 different sources, I can see that their formats are indeed different.
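One way to confirm which files belong to which chip type is to tabulate the CEL headers before calling ReadAffy. The following is only a minimal sketch: `cel_chip_table` and its `read_header` argument are hypothetical names introduced here, and it assumes that `affyio::read.celfile.header()` returns a list with the elements `cdfName` and `CEL dimensions`.

```r
# Sketch: tabulate the chip type and grid size recorded in each CEL header.
# `cel_chip_table` is a hypothetical helper, not part of affy or affyio.
# `read_header` defaults to affyio::read.celfile.header (assumed to return a
# list with `cdfName` and `CEL dimensions`); it is a parameter only so that
# the function can be tried with a stub when affyio is not installed.
cel_chip_table <- function(cel_files,
                           read_header = function(f) affyio::read.celfile.header(f)) {
  rows <- lapply(cel_files, function(f) {
    h <- read_header(f)
    dims <- h[["CEL dimensions"]]
    data.frame(file = basename(f),
               cdfName = h[["cdfName"]],
               cols = dims[[1]],
               rows = dims[[2]],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Usage, with pathA as defined in the question:
# cel_files <- list.files(pathA, pattern = "\\.CEL$",
#                         full.names = TRUE, ignore.case = TRUE)
# tab <- cel_chip_table(cel_files)
# table(tab$cdfName)   # more than one name means mixed chip types
```

If the table shows more than one `cdfName`, the folder mixes chip types, which would be consistent with the "correct dimensions" error above.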

Note that I previously had no problem reading in and RMA normalising the problem files on their own; the error appears only now that the two sets are being combined. Also, there are ~21 extra probes for the ht-hgu133a files.

However, this is not simply a matter of removing a few files: the files R is complaining about make up a large portion of my data set and should be included. The files should also be normalised together, as they comprise a manually curated batch that would be skewed if the 2 file types were normalised separately.

I have provided 2 CEL files from each of the 2 different sources in the folder below:

https://www.dropbox.com/sh/1ujp3nnmiqwxkdr/AAAEx8IPZcH092vCDx9CxkHsa?dl=0&preview=5500024030401071707289.B06.CEL

I would appreciate suggestions on how to resolve this.

Thank you,

Parisa

hgu133a rma normalization affy

updated 9.7 years ago by svlachavas • written 9.7 years ago by parisa_1986

score 0 · Answer 1 · 2015-04-17

Dear Parisa,

What do you mean by different sources? Do you perhaps mean different platforms? I also tried to import them, and the 4 CEL files belong to different platforms, so, although I'm new to R, I believe it is not appropriate to preprocess them together. You could import the files from each platform as a separate small batch and then, after normalization, merge them by applying some batch effect correction (although combining different expression sets is not always wise either). You can check the links below; I hope you find some useful ideas for your problem:

http://www.researchgate.net/post/How_can_you_combine_different_published_expression_datasets_and_analyze_them_in_R

http://www.bioconductor.org/packages/2.12/bioc/html/inSilicoMerging.html

http://www.bioconductor.org/packages/release/bioc/html/frma.html
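The per-platform approach suggested above could look roughly like the sketch below. It is not a tested pipeline: `merge_common_probesets` and `rma_per_platform` are hypothetical helper names, the platform labels are placeholders, and the wrapper assumes that affy, Biobase, and the CDF packages for both chip types are installed.

```r
# Base-R helper (hypothetical name): keep only the probesets present in
# both expression matrices and bind the samples side by side.
merge_common_probesets <- function(e1, e2) {
  common <- intersect(rownames(e1), rownames(e2))
  cbind(e1[common, , drop = FALSE], e2[common, , drop = FALSE])
}

# Hypothetical wrapper: read and RMA-normalise each platform on its own,
# then merge on the shared probesets. Nothing here runs until it is called.
rma_per_platform <- function(files_a, files_b) {
  exprs_a <- Biobase::exprs(affy::rma(affy::ReadAffy(filenames = files_a)))
  exprs_b <- Biobase::exprs(affy::rma(affy::ReadAffy(filenames = files_b)))
  list(
    exprs = merge_common_probesets(exprs_a, exprs_b),
    # Placeholder labels recording which platform each sample came from
    platform = factor(rep(c("hgu133a", "ht_hgu133a"),
                          c(ncol(exprs_a), ncol(exprs_b))))
  )
}

# Usage, assuming the CEL files have already been split by chip type:
# res <- rma_per_platform(files_hgu133a, files_ht_hgu133a)
# corrected <- limma::removeBatchEffect(res$exprs, batch = res$platform)
```

Treating platform as the batch is only sensible if the biological groups are spread across both platforms. If platform and group are confounded, any batch correction will also remove part of the biological signal.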