Hi James,
I have some MTA 1.0 arrays, and I compared my R script output with the Affymetrix Expression Console output at the gene level. The results do not match. The Affymetrix Expression Console output appears to be correct, because the positive controls worked. However, none of the results from R make sense.
In RStudio, I ran the following, and it all completed without errors:
library(makecdfenv)
make.cdf.package("MTA-1_0.r1.gene.cdf", species = "Mus_musculus")
install.packages("mta10.r1.genecdf/", repos = NULL, type = "source")
library(affy)
library(mta10.r1.genecdf)
data <- ReadAffy(cdfname="mta10.r1.genecdf")
annotation(data) <- "mta10.r1.genecdf"
eset <- rma(data)
e <- exprs(eset)
There are 71293 rows in e.
Then I used annaffy and mta10sttranscriptcluster.db to map the probe IDs (TCxxxxx) to gene symbols etc.
However, even just looking at the eset and e objects in R, the results do not make sense at all. Using the TCxxxxx probe IDs and Affymetrix's recently released CSV file (MTA-1_0.na35.mm10.transcript), I was able to find the gene symbols for many genes known to be highly or lowly expressed in my particular samples. However, the probe intensities all look pretty much the same. In Affymetrix Expression Console, these genes behave the way they should.
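The spot check described above can be sketched roughly as follows. This is a minimal illustration only: the matrix is a made-up stand-in for e, the row names are placeholders rather than real TC IDs, and spot_check is a hypothetical helper, not part of affy or annaffy.

```r
# Toy stand-in for e <- exprs(eset); values and IDs are invented for illustration.
set.seed(1)
e <- matrix(rnorm(5 * 3, mean = 7, sd = 2), nrow = 5,
            dimnames = list(paste0("TC_placeholder_", 1:5),
                            paste0("sample", 1:3)))

# Placeholder IDs standing in for genes expected to be high or low in the samples.
high_ids <- c("TC_placeholder_1", "TC_placeholder_2")
low_ids <- c("TC_placeholder_4")

# Hypothetical helper: mean value per probeset, for the IDs that exist in e.
spot_check <- function(e, ids) {
  found <- intersect(ids, rownames(e))
  data.frame(id = found,
             mean_value = rowMeans(e[found, , drop = FALSE]),
             row.names = NULL)
}

print(summary(as.vector(e)))   # overall spread of the values
print(spot_check(e, high_ids)) # expected to sit near the top of that spread
print(spot_check(e, low_ids))  # expected to sit near the bottom
```

If the expected-high and expected-low probesets land in the same narrow range, that matches the problem described here.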
My suspicion is that either something is wrong with the CDF file or I should not be using rma. Is that the case?
Thanks for your help.