Question: DEG with Limma without replicates and best plots to represent results
gravatar for Nithisha
12 months ago by
Nithisha10 wrote:

Hi all,

I have a few questions regarding Limma.

1) I am asking a question with reference to C: limma topTable doesn't work without replicates [was: Help with limma]. This is a case where there are no replicates for samples, and we cannot run eBayes or toptable. I understand that running fit2$coefficients will bring out logFC values for all the samples. However, when I run this, I get a lgFC value for all samples including my control. In such a case, what is the reference sample that we can compare the lgFC values against?

2) I would like to visually represent my top hits for say 20 most up-regulated genes and 20 most down-regulated genes after DEG analysis. Are the best visual representations to show the analysis and the genes a volcano plot? Or are there better plots to show this information?

Any advice would be appreciated.


ADD COMMENTlink modified 12 months ago by Aaron Lun21k • written 12 months ago by Nithisha10
gravatar for Aaron Lun
12 months ago by
Aaron Lun21k
Cambridge, United Kingdom
Aaron Lun21k wrote:

It's hard to answer your question without an idea of your experimental design or of how you've set up your design matrix. In particular, the interpretation of the coefficients requires some care as these values may not necessarily represent log-fold changes of interest. Consider the following example:

groups <- LETTERS[1:5] # five groups, one sample each.
design <- model.matrix(~0 + groups)
y <- matrix(rnorm(1000), ncol=length(groups)) # log-expression values.
fit <- lmFit(y, design)

Here, the coefficients represent the average log-expression of each group - which, in your case, is the same as the log-expression of the corresponding sample, given that you only have one sample in each group. None of the coefficient values represent log-fold changes between groups. Instead, if you want log-fold changes, you need to do some more work:

con <- makeContrasts(groupsA - groupsB, 
                     groupsC - groupsD, 
                     # ... and however many comparisons you want ...
fit2 <-, con)

The story is different if you've made a design matrix without the ~ 0 above, in which case the first coefficient is the intercept (i.e., the log-value of the "first" group) and the other coefficients represent the log-fold change of each other group to the first group. Thus the reference group is the first group, here A. (More generally, the first level of the factor given to model.matrix.)

In all cases, it is critical that you properly understand the meaning of each column of the design matrix before trying to interpret the values of the coefficients. Misinterpretation of the coefficients is a major source of errors in DE analyses.

As for your other question; you can't use a volcano plot here because you can't compute p-values with only one replicate. I would suggest using an MA plot for each pairwise comparison and highlighting the genes of interest. Alternatively you could use a heatmap if you want to show samples from all groups at once (assuming you have more than two groups).

ADD COMMENTlink modified 12 months ago • written 12 months ago by Aaron Lun21k

Hi Aaron,

Thank you so much, this explains a lot, appreciate the help.

Just to clarify, when we make a contrast matrix and in the final output for fit2$coefficients, we get +2 for groupsA-groupsB for Gene A, this would mean that Gene A was down regulated by (lg2) times in groupsB with respect to groupsA right?

And if we had instead done design=(~groups) instead, then I can expect to see, in the final output, for Gene A, under column groupsB, a  value of -2? Which again, would mean that Gene A was down regulated  by (lg2) times in groupsB with respect to groupsA which was coeff=1?


ADD REPLYlink written 12 months ago by Nithisha10

A value of 2 for the coefficient of the contrast groupsA - groupsB represents a 4-fold change (log-fold change of 2) in group A over group B. Similarly, a value of -2 for the groupsB coefficient in a design matrix with an intercept represents a downregulation in group B compared to the intercept, i.e., group A.

If you get confused about the direction, you can always check the original expression values to get your bearings.

ADD REPLYlink modified 12 months ago • written 12 months ago by Aaron Lun21k

Hi Aaron,

Thank you, that makes a lot of sense!

ADD REPLYlink written 12 months ago by Nithisha10
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 327 users visited in the last hour