I am working with Affymetrix (U133A 2.0 chip) data for the first time. The data has been background corrected and normalized, and is intended to be used to compare different groups. From examining the E. coli Lrp example, I believe this data typically exists in the .CEL format with the mean expression levels and standard deviation.
The problem is, I've been given data that has already been parsed; each row represents a probe and each column represents a mean expression value for a particular subject. There is no standard deviation measurement included in this data. To highlight this, I posted a very small portion of the data below (there are many more subjects and probes).
probe_1 | subject_1 | subject_2 | subject_3 |
1598_g_at | 1.28409 | 1.34388 | 1.34706 |
160020_at | 2.88587 | 2.84006 | 2.78932 |
From what I understand, the affy package typically reads .CEL files [e.g. read.affybatch(), ReadAffy()], which limma will work with. But as I don't have the files, I am a bit perplexed about how to approach this problem. Initially, I thought I could reconstruct the .CEL files by using the E. coli Lrp example, but I noticed that the standard deviations were different in each file. (I am a grad student. My initial thought was that the standard deviation was calculated from the expression levels of different samples, but this doesn't seem to be the case.) Thus, without the STDV value, I feel like I may be missing something crucial.
Looking for advice
Thank you!
Thank you for the quick reply,
So there's no problem, you just read the matrix into R and use it in limma as usual.
I was able to load the data using ExpressionSet(). As this is a Biobase function, I am not sure if this is the "usual" way to load data. But lmFit seems to handle the data just fine (coercing that object with getEAWP() doesn't seem to change anything). Is there anything else I should be aware of?
I don't follow your comments about STDV. Processing CEL files does not produce a STDV value, nor is such a value required by limma, so I don't follow what the problem is.
As I was unaware of what affy data looked like, I followed the "Lrp Mutant E. Coli Strain with Affymetrix Arrays" tutorial in the limma user guide (Section 17.1). After downloading the .CEL files (http://bioinf.wehi.edu.au/limma/data/ecoli-lrp.zip), I noticed they have a STDV column. Below is an excerpt of one of the .CEL files.
Thank you for all your help!
It's even easier than that. As I said in my answer, you can just read the file and give it to limma. See the edits to my answer above.
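For anyone landing here later, a minimal sketch of what "read the file and give it to limma" could look like. It makes several assumptions not stated in the thread: the table is tab-delimited with probe IDs in the first column, the values are already on a log scale, and there are two groups of subjects. The group labels, the simulated matrix and the file name mentioned in the comments are all hypothetical. No per-probe standard deviation is supplied because limma estimates each probe's variability from the spread across subjects.

```r
# 1. Read the parsed table: probes in rows, subjects in columns.
#    The excerpt from the question is re-typed here as tab-delimited text so the
#    example is self-contained. For a real file, something like
#    read.delim("expression.txt", row.names = 1) would be used instead
#    ("expression.txt" is a hypothetical file name).
txt <- paste(
  "probe\tsubject_1\tsubject_2\tsubject_3",
  "1598_g_at\t1.28409\t1.34388\t1.34706",
  "160020_at\t2.88587\t2.84006\t2.78932",
  sep = "\n"
)
expr <- as.matrix(read.delim(text = txt, row.names = 1, check.names = FALSE))
stopifnot(is.numeric(expr))  # limma only needs a numeric matrix

# 2. A three-subject, two-probe excerpt is too small to fit a meaningful model,
#    so the fit below uses SIMULATED values with the same layout
#    (assumption: 200 probes, 6 subjects, values already on a log scale).
set.seed(1)
sim <- matrix(
  rnorm(200 * 6, mean = 7),
  nrow = 200,
  dimnames = list(paste0("probe_", 1:200), paste0("subject_", 1:6))
)

# 3. Hypothetical grouping: first three subjects are controls, last three treated.
group  <- factor(rep(c("control", "treated"), each = 3))
design <- model.matrix(~ group)

# 4. Fit the linear model and moderate the variances. The block is skipped if
#    limma is not installed.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(sim, design))
  # coef = 2 is the treated-vs-control column of the design matrix
  print(limma::topTable(fit, coef = 2, number = 5))
}
```

With real data, `sim` would simply be replaced by the matrix read in step 1, and `group` by the actual group membership of each column, in column order.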
Modern CEL files are all binary and you will not see any entries like those for the Lrp data.