Question

Re-use limma beta coefficients

Keifa ▴ 10

Dear All,

I am analysing gene expression profiles (Illumina microarray) of 12 samples: 6 organoids and the 6 matched tumor samples. The organoid and tumor profiles look different, and I identified hundreds of differentially expressed genes (lmFit + eBayes + topTable, adjusted p-value < 0.05). The goal of my project is not to identify differences between organoids and tumor samples but to make the two profiles comparable. I was thinking of using limma's removeBatchEffect with organoids and WT as the batch, to correct for this known source of variation. The question is: is it possible to re-use the beta coefficients calculated by removeBatchEffect and apply them to a new set of organoid samples profiled on the same platform?

Best,

K

limma removebatcheffect • 1.3k views

score 1 · Answer 1 · 2016-11-29

That's an interesting problem. The coefficients represent the log-fold changes between organoids and tumour samples, so if the confidence intervals are small enough (check with topTable), then it would be reasonable to assume that the log-fold changes are reproducible across runs. You could then, in theory, subtract them from the expression profiles of new organoid samples generated through the same process to obtain rough "tumour-like" expression profiles. However, whether this is accurate enough depends on what you plan to do next. In particular, direct subtraction wouldn't account for the uncertainty with which you estimated the organoid/tumour log-fold change, which I suppose might be fairly large if you're working with patient samples.
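As a rough illustration of the direct subtraction described above, here is a minimal sketch. removeBatchEffect only returns the corrected matrix, not the coefficients, so the sketch estimates the organoid/tumour log-fold change with lmFit instead. Everything here is an assumption for illustration: the data are simulated, and the names expr_old, expr_new, patient and type are hypothetical placeholders for the real log-expression matrices and sample annotation.

```r
library(limma)

set.seed(1)
n_genes <- 100

# Hypothetical annotation: 6 patients, each with a tumour and a matched organoid
patient <- factor(rep(1:6, times = 2))
type <- factor(rep(c("Tumour", "Organoid"), each = 6),
               levels = c("Tumour", "Organoid"))

# Simulated log-expression standing in for the original 12 samples
expr_old <- matrix(rnorm(n_genes * 12, mean = 8), nrow = n_genes,
                   dimnames = list(paste0("gene", seq_len(n_genes)),
                                   paste0("old", 1:12)))
# Make the first 10 genes higher in organoids
expr_old[1:10, type == "Organoid"] <- expr_old[1:10, type == "Organoid"] + 2

# Paired design: block on patient, estimate the organoid-vs-tumour effect
design <- model.matrix(~ patient + type)
fit <- eBayes(lmFit(expr_old, design))

# Per-gene organoid/tumour log-fold change
beta <- fit$coefficients[, "typeOrganoid"]

# Confidence intervals (columns CI.L and CI.R) to judge how precise each estimate is
tab <- topTable(fit, coef = "typeOrganoid", number = Inf,
                confint = TRUE, sort.by = "none")

# Simulated new organoid samples from the same platform (hypothetical)
expr_new <- matrix(rnorm(n_genes * 4, mean = 8), nrow = n_genes,
                   dimnames = list(rownames(expr_old), paste0("new", 1:4)))
expr_new[1:10, ] <- expr_new[1:10, ] + 2

# Direct subtraction: beta has one value per gene, so it is removed row by row
tumour_like <- expr_new - beta
```

Note that tumour_like carries no information about how uncertain beta is, so genes with wide intervals in tab deserve caution.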

It is possible to get over this uncertainty in particular applications, such as in a DE analysis where you have both tumour and organoid samples in a new batch. You could combine your new and old data and test for "differences of differences", i.e., significant differences in the tumour/organoid log-fold changes between the new and old batches. This effectively performs the subtraction while accounting for the uncertainty with which the old log-fold changes are estimated. In other applications like data exploration (e.g., clustering or dimensionality reduction), maybe you don't need to be especially rigorous, in which case direct subtraction might be good enough. But the onus would be on you to test it.

score 0 · Answer 2 · 2016-11-30

Keifa ▴ 10

Hi Aaron,

Thank you for your answer, I really appreciate it. I did not catch your point about "differences of differences". Do you mean performing a statistical test for each gene using the two log-fold changes and their confidence intervals?

K

Firstly, reply to answers using the "add comment" form, not the "add answer" form, unless you're answering your own question. Secondly, the "differences of differences" test can be done like this:

library(limma)
# Difference of differences: new organoid/tumour log-fold change minus the old one
con <- makeContrasts((Organoids.new - Tumour.new)
                     - (Organoids.old - Tumour.old), levels=design)

... where each term represents the log-expression of the corresponding group. The idea, as the mathematical expression above suggests, is to look for differences in the organoid/tumour log-fold change between the old and new batches. Of course, this is only applicable for DE analyses where you actually have two batches of organoid/tumour samples (and presumably the second batch is biologically different from the first, otherwise the difference of differences wouldn't be particularly interesting).
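For context, the contrast above could sit in a workflow along these lines. This is only a sketch under stated assumptions: the data are simulated, the four-level group factor and the contrast name DiffOfDiff are hypothetical, and the model ignores patient pairing for brevity.

```r
library(limma)

set.seed(2)
n_genes <- 100

# Hypothetical grouping: 3 samples in each batch/type combination
group <- factor(rep(c("Organoids.old", "Tumour.old",
                      "Organoids.new", "Tumour.new"), each = 3))

# Simulated log-expression values standing in for the combined old + new data
expr <- matrix(rnorm(n_genes * 12, mean = 8), nrow = n_genes)
rownames(expr) <- paste0("gene", seq_len(n_genes))

# One coefficient per group, so each term in the contrast is a group mean
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

fit <- lmFit(expr, design)

# Difference of differences between the new and old organoid/tumour log-fold changes
con <- makeContrasts(
  DiffOfDiff = (Organoids.new - Tumour.new) - (Organoids.old - Tumour.old),
  levels = design
)

fit2 <- eBayes(contrasts.fit(fit, con))

# Genes whose organoid/tumour log-fold change differs between batches
res <- topTable(fit2, coef = "DiffOfDiff", number = Inf)
```

With matched samples you would probably want to account for the pairing as well (for example by adding patient to the design); that is left out here to keep the example short.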

ADD REPLY • link 7.4 years ago Aaron Lun ★ 28k