Search
Question: DEG with Limma without replicates and best plots to represent results
0
gravatar for Natasha
15 days ago by
Natasha0
Natasha0 wrote:

Hi all,

I have a few questions regarding Limma.

1) I am asking a question with reference to C: limma topTable doesn't work without replicates [was: Help with limma]. This is a case where there are no replicates for samples, and we cannot run eBayes or toptable. I understand that running fit2$coefficients will bring out logFC values for all the samples. However, when I run this, I get a lgFC value for all samples including my control. In such a case, what is the reference sample that we can compare the lgFC values against?

2) I would like to visually represent my top hits for say 20 most up-regulated genes and 20 most down-regulated genes after DEG analysis. Are the best visual representations to show the analysis and the genes a volcano plot? Or are there better plots to show this information?

Any advice would be appreciated.

Thanks!

ADD COMMENTlink modified 15 days ago by Aaron Lun17k • written 15 days ago by Natasha0
2
gravatar for Aaron Lun
15 days ago by
Aaron Lun17k
Cambridge, United Kingdom
Aaron Lun17k wrote:

It's hard to answer your question without an idea of your experimental design or of how you've set up your design matrix. In particular, the interpretation of the coefficients requires some care as these values may not necessarily represent log-fold changes of interest. Consider the following example:

groups <- LETTERS[1:5] # five groups, one sample each.
design <- model.matrix(~0 + groups)
y <- matrix(rnorm(1000), ncol=length(groups)) # log-expression values.
fit <- lmFit(y, design)
head(fit$coefficients)

Here, the coefficients represent the average log-expression of each group - which, in your case, is the same as the log-expression of the corresponding sample, given that you only have one sample in each group. None of the coefficient values represent log-fold changes between groups. Instead, if you want log-fold changes, you need to do some more work:

con <- makeContrasts(groupsA - groupsB, 
                     groupsC - groupsD, 
                     # ... and however many comparisons you want ...
                     levels=design)
fit2 <- contrasts.fit(fit, con)
head(fit2$coefficients)

The story is different if you've made a design matrix without the ~ 0 above, in which case the first coefficient is the intercept (i.e., the log-value of the "first" group) and the other coefficients represent the log-fold change of each other group to the first group. Thus the reference group is the first group, here A. (More generally, the first level of the factor given to model.matrix.)

In all cases, it is critical that you properly understand the meaning of each column of the design matrix before trying to interpret the values of the coefficients. Misinterpretation of the coefficients is a major source of errors in DE analyses.

As for your other question; you can't use a volcano plot here because you can't compute p-values with only one replicate. I would suggest using an MA plot for each pairwise comparison and highlighting the genes of interest. Alternatively you could use a heatmap if you want to show samples from all groups at once (assuming you have more than two groups).

ADD COMMENTlink modified 15 days ago • written 15 days ago by Aaron Lun17k

Hi Aaron,

Thank you so much, this explains a lot, appreciate the help.

Just to clarify, when we make a contrast matrix and in the final output for fit2$coefficients, we get +2 for groupsA-groupsB for Gene A, this would mean that Gene A was down regulated by (lg2) times in groupsB with respect to groupsA right?

And if we had instead done design=(~groups) instead, then I can expect to see, in the final output, for Gene A, under column groupsB, a  value of -2? Which again, would mean that Gene A was down regulated  by (lg2) times in groupsB with respect to groupsA which was coeff=1?

Thanks!

ADD REPLYlink written 15 days ago by Natasha0

A value of 2 for the coefficient of the contrast groupsA - groupsB represents a 4-fold change (log-fold change of 2) in group A over group B. Similarly, a value of -2 for the groupsB coefficient in a design matrix with an intercept represents a downregulation in group B compared to the intercept, i.e., group A.

If you get confused about the direction, you can always check the original expression values to get your bearings.

ADD REPLYlink modified 15 days ago • written 15 days ago by Aaron Lun17k

Hi Aaron,

Thank you, that makes a lot of sense!

ADD REPLYlink written 15 days ago by Natasha0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 193 users visited in the last hour