Question

perform probe level RMA without summarization

1

Entering edit mode

jxd_84 ▴ 10

@jxd_84-12120

Last seen 8.1 years ago

I'm trying to perform normalization on a set of Affy U133A chips. I'm using the affy library on R and was able to do the regular probeset level normalization using ReadAffy() and rma() on the affybatch object.

But now I want to do the same thing but for the probe level data. The problem is, I cannot use expresso with summarization disabled (a summary argument has to be provided or it returns an error). Nor can I use the function pm() because the probe to probeset id numbers are lost. That is, with pm() the row identifiers turn into probe numbers. but with the function probes() i still get 1007_s_at1, 1007_s_at2, etc.

Is there a way I can do all the steps that RMA does to the probe level data, without summarization, and with keeping the associated probeset id information to each probe?

I hope this was was clear. If not, I can elaborate.

u133a normalization rma affy • 1.3k views

ADD COMMENT • link updated 8.2 years ago by James W. MacDonald 68k • written 8.2 years ago by jxd_84 ▴ 10

score 1 · Answer 1 · 2017-01-06

It's not clear exactly what you are after; I assume you just want to background correct and then normalize the probes? You can easily do that. As an example:

> dat <- ReadAffy()
> dat

AffyBatch object
size of arrays=712x712 features (18 kb)
cdf=HG-U133A (22283 affyids)
number of samples=6
number of genes=22283
annotation=hgu133a
notes=

> dat.bg <- bg.correct(dat, "rma")
> dat.bg.norm <- normalize(dat.bg, "quantiles")

Now we have background corrected and normalized data. It's probably easiest to now extract these data into a list:

> prbs <- probes(dat.bg.norm, LISTRUE = TRUE)

And now we have a list of background corrected, normalized probes, by probeset. These are not log transformed yet, so we could do that as well.

> prbs.log <- lapply(prbs, log2)

And just to check, let's compare the RMA results from just running RMA directly vs what we get from our data:

> eset <- rma(dat)
> head(exprs(eset), 1)
          GSM533844.CEL.gz GSM533845.CEL.gz GSM533846.CEL.gz GSM533847.CEL.gz
1007_s_at         8.764547         9.015684          8.93276         9.319638
          GSM533848.CEL.gz GSM533849.CEL.gz
1007_s_at         9.225833         9.157688
> z <- medpolish(prbs.log[["1007_s_at"]])
1: 26.33469
2: 25.76838
Final: 25.68395
> z$col + z$overall
GSM533844.CEL.gz GSM533845.CEL.gz GSM533846.CEL.gz GSM533847.CEL.gz
        8.764547         9.015684         8.932760         9.319638
GSM533848.CEL.gz GSM533849.CEL.gz
        9.225833         9.157688

Which looks pretty much exactly the same.