Question

limma::avereps for median?

b.nota ▴ 360

Hi, I am using Agilent single-channel data and find the limma::avereps function handy, since the number of replicate probes on the array varies per gene (locus). However, I would prefer medians instead of averages. Is there an argument setting that makes this function return the median instead of the average? Or is there another function available in limma? I could of course use dplyr or a similar package to calculate the median per group (probe set), but I like the fact that limma::avereps keeps the EList format.

Thanks for your help in advance.

limma • 899 views

updated 4.4 years ago by Gordon Smyth 50k • written 4.4 years ago by b.nota ▴ 360

score 2 · Answer 1 · 2019-11-12

No, avereps does not compute medians. (The function does what it is documented to do, and only that.) There is no equivalent to avereps but with medians, so you would have to do your own programming.
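For anyone who still wants medians, the "own programming" could look roughly like the sketch below. It is only an illustration in base R, not a limma function: the helper name medreps_matrix and the toy data are made up for this example, and it works on a plain expression matrix plus a vector of IDs.

```r
# Hypothetical helper (not part of limma): per-ID medians for each array.
# E  : numeric matrix, rows = probes, columns = arrays
# ID : vector with one identifier per row of E
medreps_matrix <- function(E, ID) {
  ID <- as.character(ID)
  uid <- unique(ID)  # IDs in order of first appearance
  med <- vapply(uid, function(id) {
    apply(E[ID == id, , drop = FALSE], 2, median, na.rm = TRUE)
  }, numeric(ncol(E)))
  # vapply gives arrays x IDs (or a plain vector for one array); make it IDs x arrays
  out <- t(matrix(med, nrow = ncol(E)))
  dimnames(out) <- list(uid, colnames(E))
  out
}

# Toy data: three probes for gene g1, one probe for gene g2, two arrays
E <- rbind(c(1, 10), c(2, 20), c(9, 30), c(5, 50))
colnames(E) <- c("A1", "A2")
ID <- c("g1", "g1", "g1", "g2")
med <- medreps_matrix(E, ID)
print(med)
```

To stay in the EList format, one option (assuming the identifier lives in y$genes$ProbeName) would be to subset the EList to the first row of each ID with y2 <- y[!duplicated(y$genes$ProbeName),] and then overwrite the expression values with y2$E <- medreps_matrix(y$E, y$genes$ProbeName). The row order matches because both use first appearance. Any weights or other per-probe components would not be summarized by this sketch.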

I do not recommend the use of avereps for Agilent data, and I would be even more concerned about using medians. Suppose you have three probes for a gene, only one of which corresponds to an expressed transcript. Then taking medians will discard the expressed transcript as an outlier!
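A small made-up example shows the problem. The log2 intensities below are invented for illustration: two probes sit at background level and one probe detects the expressed transcript.

```r
# Hypothetical log2 intensities: 3 probes for one gene, 4 arrays
E <- rbind(
  probe1 = c(6.1, 6.0, 6.2, 5.9),     # background level
  probe2 = c(12.3, 12.8, 11.9, 12.5), # the expressed transcript
  probe3 = c(6.3, 6.2, 6.0, 6.1)      # background level
)

gene_median <- apply(E, 2, median)  # per-array median over the 3 probes
gene_mean <- colMeans(E)            # per-array mean, which is what averaging gives

print(gene_median)
print(gene_mean)
```

The medians all fall at background level, so the signal from probe2 disappears completely. The means retain some of it, but diluted by the two unexpressed probes.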

Usually I keep probes separate and I don't recommend consolidating probes into genes routinely. When it is necessary to ensure one probe per gene, I recommend keeping the probe that seems most expressed overall. For an EList y, I would use:

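# Average log-expression of each probe across all arrays
A <- rowMeans(y$E)
# Sort probes from most to least expressed
o <- order(A, decreasing=TRUE)
y <- y[o,]
# Keep only the first (most expressed) probe for each gene symbol
d <- duplicated(y$genes$Symbol)
y <- y[!d,]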

In the above I have assumed that the gene identifier is called Symbol. If your identifier is different, just substitute the code in the obvious way.
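As an illustration of the substitution, here is the same recipe run on toy data where the identifier column is called GeneName instead. The column name, probe names and values are assumptions made up for this sketch, and a plain matrix plus data frame stand in for y$E and y$genes so that it runs without limma.

```r
# Toy stand-ins for y$E and y$genes (hypothetical values and names)
E <- rbind(c(6.1, 6.0), c(12.3, 12.8), c(6.3, 6.2), c(9.0, 9.4), c(10.1, 9.9))
genes <- data.frame(
  ProbeName = paste0("p", 1:5),
  GeneName = c("TP53", "TP53", "TP53", "MYC", "MYC"),
  stringsAsFactors = FALSE
)

# Same steps as for the EList, with GeneName in place of Symbol
A <- rowMeans(E)
o <- order(A, decreasing = TRUE)
E <- E[o, , drop = FALSE]
genes <- genes[o, ]
d <- duplicated(genes$GeneName)
E <- E[!d, , drop = FALSE]
genes <- genes[!d, ]

print(genes)
```

With an EList the two objects are subset together by y[o,] and y[!d,], so only the line computing d needs to change.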