issue with mta10.r1.genecdf: eset results don't make sense
@juliayuecui2011-8112
Last seen 8.9 years ago
United States

Hi James,

I have some MTA 1.0 arrays, and I compared the gene-level output of my R script with the output of Affymetrix Expression Console. The results do not match. The Expression Console output appears to be correct, because the positive controls worked, but none of the results from R make sense.

In RStudio, I ran the following, and everything completed without errors:

library(makecdfenv)
make.cdf.package("MTA-1_0.r1.gene.cdf", species = "Mus_musculus")
install.packages("mta10.r1.genecdf/", repos = NULL, type = "source")
library(affy)
library(mta10.r1.genecdf)
data <- ReadAffy(cdfname = "mta10.r1.genecdf")
annotation(data) <- "mta10.r1.genecdf"
eset <- rma(data)
e <- exprs(eset)

There are 71293 rows in e. 

Then I used annaffy and mta10sttranscriptcluster.db to convert the probe IDs (TCxxxxx) to gene symbols etc. 

However, even from a quick look at eset and e in R, the results do not make sense at all. Using the TCxxxxx probe IDs and the CSV file that Affymetrix recently released (MTA-1_0.na35.mm10.transcript), I was able to find the gene symbols for many genes known to be highly or lowly expressed in my particular samples, yet their probe intensities all look pretty much the same. In Affymetrix Expression Console, these same genes behave the way they should.

My suspicion is that either something is wrong with the CDF file or I should not be using rma. Is that right?

 

Thanks for your help.

@james-w-macdonald-5106
Last seen 5 hours ago
United States

I should probably fix the error message in affy so that it also covers the HTA and MTA arrays. The affy package should really only be used for the old 3'-biased arrays; for everything else you should use either oligo or xps. To use oligo you would do:

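library(oligo)
# Read every CEL file in the current working directory
dat <- read.celfiles(list.celfiles())
# Background-correct, normalize, and summarize (transcript level for these arrays)
eset <- rma(dat)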

Depending on what you want to do, you might want to summarize at different levels. The MTA arrays are very complex and can, hypothetically, be used to detect differential splicing, but there is nothing in Bioconductor that I know of that is designed for that sort of analysis. If you simply want to measure differential expression, the code above will summarize the data at the transcript level, and you can then use, e.g., limma to make comparisons between different groups.
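As a rough illustration of the limma step, here is a minimal sketch of a two-group comparison. It is not taken from the thread: the expression matrix is simulated so that the example runs without any CEL files, and the row names (TC1, TC2, ...), the sample names, and the control/treated grouping are all hypothetical. In a real analysis you would pass the ExpressionSet returned by rma (or exprs(eset)) to lmFit and build the design from your own sample information.

```r
library(limma)

# Stand-in for exprs(eset): simulated log2 values for 1000 transcript
# clusters across 6 samples. Row and sample names are made up.
set.seed(1)
expr <- matrix(rnorm(1000 * 6, mean = 7, sd = 1), nrow = 1000,
               dimnames = list(paste0("TC", seq_len(1000)),
                               paste0("sample", 1:6)))

# Shift the first 20 rows upward in the last three samples so that
# there is a real difference for the model to find
expr[1:20, 4:6] <- expr[1:20, 4:6] + 3

# Hypothetical grouping: three controls followed by three treated samples
group <- factor(rep(c("control", "treated"), each = 3))
design <- model.matrix(~ group)

# Fit a linear model per row, then moderate the variances
fit <- eBayes(lmFit(expr, design))

# Top-ranked rows for the treated vs. control coefficient
top <- topTable(fit, coef = "grouptreated", number = 10)
print(top)
```

The logFC column is the estimated log2 difference between treated and control, and adj.P.Val is the multiplicity-adjusted p-value. With real data, known high and low expressors should also stand out in the AveExpr column, which gives a quick sanity check on the summarized values.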
