Question: perform probe level RMA without summarization
gravatar for jxd_84
15 months ago by
jxd_8410 wrote:

I'm trying to perform normalization on a set of Affy U133A chips. I'm using the affy library on R and was able to do the regular probeset level normalization using ReadAffy() and rma() on the affybatch object.

But now I want to do the same thing but for the probe level data. The problem is, I cannot use expresso with summarization disabled (a summary argument has to be provided or it returns an error).  Nor can I use the function pm() because the probe to probeset id numbers are lost.  That is, with pm() the row identifiers turn into probe numbers. but with the function probes() i still get 1007_s_at1, 1007_s_at2, etc.   

Is there a way I can do all the steps that RMA does to the probe level data, without summarization, and with keeping the associated probeset id information to each probe?

I hope this was was clear. If not, I can elaborate. 

ADD COMMENTlink modified 15 months ago by James W. MacDonald46k • written 15 months ago by jxd_8410
gravatar for James W. MacDonald
15 months ago by
United States
James W. MacDonald46k wrote:

It's not clear exactly what you are after; I assume you just want to background correct and then normalize the probes? You can easily do that. As an example:

> dat <- ReadAffy()
> dat

AffyBatch object
size of arrays=712x712 features (18 kb)
cdf=HG-U133A (22283 affyids)
number of samples=6
number of genes=22283

> <- bg.correct(dat, "rma")
> <- normalize(, "quantiles")

Now we have background corrected and normalized data. It's probably easiest to now extract these data into a list:

> prbs <- probes(, LISTRUE = TRUE)

And now we have a list of background corrected, normalized probes, by probeset. These are not log transformed yet, so we could do that as well.

> prbs.log <- lapply(prbs, log2)

And just to check, let's compare the RMA results from just running RMA directly vs what we get from our data:

> eset <- rma(dat)
> head(exprs(eset), 1)
          GSM533844.CEL.gz GSM533845.CEL.gz GSM533846.CEL.gz GSM533847.CEL.gz
1007_s_at         8.764547         9.015684          8.93276         9.319638
          GSM533848.CEL.gz GSM533849.CEL.gz
1007_s_at         9.225833         9.157688
> z <- medpolish(prbs.log[["1007_s_at"]])
1: 26.33469
2: 25.76838
Final: 25.68395
> z$col + z$overall
GSM533844.CEL.gz GSM533845.CEL.gz GSM533846.CEL.gz GSM533847.CEL.gz
        8.764547         9.015684         8.932760         9.319638
GSM533848.CEL.gz GSM533849.CEL.gz
        9.225833         9.157688

Which looks pretty much exactly the same.





ADD COMMENTlink written 15 months ago by James W. MacDonald46k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 199 users visited in the last hour