Question: Re-use limma beta coefficients
2.6 years ago by
Keifa10
Keifa10 wrote:

Dear All,

I am analysing a gene expression profiling dataset (Illumina microarray) of 12 samples: 6 organoids and the 6 matched tumor samples. The gene expression profiles of organoids and tumor samples look different, and I identified hundreds of differentially expressed genes (lmFit + eBayes + topTable, adjusted p-value < 0.05). The goal of my project is not to identify differences between organoids and tumor samples but to make the two profiles comparable. I was thinking of using limma's removeBatchEffect with Organoids and WT as the batch, to correct for this known source of variation. The question is: is it possible to re-use the same beta coefficients calculated by the removeBatchEffect function and apply them to a new set of organoid samples profiled on the same platform?

Best,

K

2.6 years ago by
Aaron Lun
Cambridge, United Kingdom
Aaron Lun wrote:

That's an interesting problem. The coefficients represent the log-fold changes between organoids and tumour samples, so if the confidence intervals are small enough (check with topTable), then it would be reasonable to assume that the log-fold changes are reproducible across runs. You could then theoretically subtract them from the expression profiles of new organoid samples generated through the same process to obtain rough "tumour-like" expression profiles. However, whether this is accurate enough depends on what you plan to do next. In particular, direct subtraction wouldn't account for the uncertainty with which you estimated the organoid/tumour log-fold change, which I suppose might be fairly large if you're working with patient samples.

It is possible to get over this uncertainty in particular applications, such as in a DE analysis where you have both tumour and organoid samples in a new batch. You could combine your new and old data and test for "differences of differences", i.e., significant differences in the tumour/organoid log-fold changes between the new and old batches. This effectively performs the subtraction while accounting for the uncertainty with which the old log-fold changes are estimated. In other applications like data exploration (e.g., clustering or dimensionality reduction), maybe you don't need to be especially rigorous, in which case direct subtraction might be good enough. But the onus would be on you to test it.
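For the direct-subtraction route, a minimal sketch might look like the following. Everything here is an assumption for illustration: the data are simulated, the object names (y_old, y_new, beta) are hypothetical, and the paired design (~ patient + type) is just one reasonable choice given that the tumour samples are matched. The coefficients are taken from lmFit because removeBatchEffect returns only the corrected matrix.

```r
library(limma)

set.seed(1)
n_genes <- 100
patient <- factor(rep(1:6, times = 2))
type <- factor(rep(c("tumour", "organoid"), each = 6),
               levels = c("tumour", "organoid"))

# Simulated log-expression matrix (genes x samples); stands in for the real data.
y_old <- matrix(rnorm(n_genes * 12, mean = 8), nrow = n_genes,
                dimnames = list(paste0("gene", 1:n_genes), paste0("s", 1:12)))
y_old[1:10, type == "organoid"] <- y_old[1:10, type == "organoid"] + 2

# Assumed paired design: patient blocks plus one organoid-vs-tumour coefficient.
design <- model.matrix(~ patient + type)
fit <- eBayes(lmFit(y_old, design))

# Per-gene organoid/tumour log-fold changes with 95% confidence intervals,
# to judge whether the estimates are precise enough to re-use.
tab <- topTable(fit, coef = "typeorganoid", number = Inf,
                confint = TRUE, sort.by = "none")
beta <- fit$coefficients[, "typeorganoid"]

# Hypothetical new organoid samples on the same platform (simulated here).
y_new <- matrix(rnorm(n_genes * 4, mean = 8), nrow = n_genes,
                dimnames = list(rownames(y_old), paste0("new", 1:4)))

# Direct subtraction: beta (one value per gene) is recycled down each column.
y_new_tumour_like <- y_new - beta
```

The CI.L and CI.R columns of tab give a per-gene sense of how uncertain each subtracted value is; the subtraction itself ignores that uncertainty, as noted above.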

2.6 years ago by
Keifa10
Keifa10 wrote:

Hi Aaron,

Thank you for your answer, I really appreciate it. I did not catch your point about "differences of differences". Do you mean performing a statistical test for each gene using the two log-fold changes and their confidence intervals?

# Difference of differences: new-batch organoid/tumour log-fold change minus the old-batch one.
con <- makeContrasts(
    (Organoids.new - Tumour.new) - (Organoids.old - Tumour.old),
    levels=design)

... where each term represents the log-expression of the corresponding group. The idea, as the mathematical expression above suggests, is to look for differences in the organoid/tumour log-fold change between the old and new batches. Of course, this is only applicable for DE analyses where you actually have two batches of organoid/tumour samples (and presumably the second batch is biologically different from the first, otherwise the difference of differences wouldn't be particularly interesting).
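To show where such a contrast fits in a full analysis, here is a rough end-to-end sketch on simulated data. The group labels, sample sizes, and expression matrix are all hypothetical, and this version uses a simple one-coefficient-per-group design that ignores any patient pairing, so treat it as an outline rather than a recommended model.

```r
library(limma)

set.seed(2)
n_genes <- 50
# Hypothetical layout: 3 samples per batch-by-type combination.
group <- factor(rep(c("Organoids.old", "Tumour.old",
                      "Organoids.new", "Tumour.new"), each = 3))

# Simulated log-expression matrix; replace with the combined old + new data.
y <- matrix(rnorm(n_genes * 12, mean = 8), nrow = n_genes,
            dimnames = list(paste0("gene", 1:n_genes), NULL))

# One coefficient per group, so each term in the contrast is a group mean.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

con <- makeContrasts(
    (Organoids.new - Tumour.new) - (Organoids.old - Tumour.old),
    levels = design)

fit <- lmFit(y, design)
fit2 <- eBayes(contrasts.fit(fit, con))

# Genes whose organoid/tumour log-fold change differs between batches.
res <- topTable(fit2, coef = 1, number = Inf, sort.by = "none")
```

Because each batch gets its own pair of group means, any overall shift between the old and new batches cancels out of the contrast; only changes in the organoid/tumour difference are tested.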