Question: Limma-voom contrast confusion
gregory.l.stone10 wrote, 7 months ago:

I am trying to use limma-voom to determine the effect of a condition between males and females. I have paired samples: about 70 males and 45 females. At first I organized my data according to section 3.5 of the edgeR user's guide. My data therefore looked like this:

sample condition sex nested
sample1 pre M 1
sample1 post M 1
sample2 pre M 2
sample2 post M 2
sample70 pre F 1
sample70 post F 1
... ... ... ...
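For reference, here is a minimal base-R sketch (assuming a data frame `data` with `sample`, `condition` and `sex` columns; the small example frame below is hypothetical) of how the `nested` column can be built so that patient numbering restarts within each sex, as in section 3.5 of the edgeR user's guide:

```r
# Hypothetical toy data mirroring the table above.
data <- data.frame(
  sample    = rep(c("sample1", "sample2", "sample70"), each = 2),
  condition = rep(c("pre", "post"), times = 3),
  sex       = rep(c("M", "M", "F"), each = 2)
)

# Renumber patients within each sex: the two male patients become 1 and 2,
# and the first female patient becomes 1 again.
data$nested <- factor(ave(seq_along(data$sample), data$sex,
                          FUN = function(i) as.integer(factor(data$sample[i]))))
# nested is now 1,1,2,2 for the two male patients and 1,1 for the female one.
```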

My code looks as follows: 

design = model.matrix(~ sex + sex:nested + sex:condition, data)

colnames(design) <- make.names(colnames(design), unique=TRUE)

v = voomWithQualityWeights(dge, design, plot=FALSE)

fit = lmFit(v, design)

#construct contrast matrix

p <- ncol(fit)
cont <- rep(0, p)
cont[p] <- 1
cont[p-1] <- -1

fit2 =, cont)

fit2 = eBayes(fit2)

results = decideTests(fit2)

> summary(results)
-1     0
0  21457
1      0
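The positional indexing above (`cont[p]`, `cont[p-1]`) is fragile if the column order of the design changes. An equivalent, more explicit way to build the same contrast is `limma::makeContrasts`; this sketch assumes the coefficient names `sexMale.conditionpost` and `sexFemale.conditionpost` shown later in the post (after `make.names`):

```r
library(limma)

# Named contrast: the post-vs-pre effect in females minus the same
# effect in males (the last and second-to-last design columns here).
cont <- makeContrasts(sexFemale.conditionpost - sexMale.conditionpost,
                      levels = design)

fit2 <- eBayes(, cont))
summary(decideTests(fit2))
```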

Getting 0 significant genes from the contrast seems unlikely because, when I look at the genes for just the interaction between sex and condition, I get the following:

fit_simple = eBayes(fit)
results_simple = decideTests(fit_simple)


#just showing columns of interest

sexMale.conditionpost sexFemale.conditionpost
-1                  4500                    3859
0                   9812                   11541
1                   6819                    5731


Because getting no significant genes seemed suspicious, I then tried to incorporate the pairing via the duplicateCorrelation function instead of in the design matrix. I also simplified the design: I did a separate run for male samples, one for female samples, and one for the interaction between sex and condition plus the contrast between the two sexes. My data then looked like this (for the male-only and female-only runs I kept just those samples in a table, with patient numbering starting at 1; patient is a factor, not numeric):

sample condition sex patient
sample1 pre M 1
sample1 post M 1
sample2 pre M 2
sample2 post M 2
sample70 pre F 70
sample70 post F 70
... ... ... ...

The code for male samples (and female samples, just different count matrix and data table) looks like this:

dge_Male <- DGEList(countdata_filtered_Male)
dge_Male <- calcNormFactors(dge_Male)
design_Male = model.matrix(~ condition, colData_Male)
v_preCor_Male = voomWithQualityWeights(dge_Male, design_Male, plot=FALSE)
first_cor_Male = duplicateCorrelation(v_preCor_Male, design = design_Male, block = colData_Male$nested)
v_postCor_Male = voomWithQualityWeights(dge_Male, design_Male, block = colData_Male$nested, correlation = first_cor_Male$consensus, plot=FALSE)
second_cor_Male = duplicateCorrelation(v_postCor_Male, design = design_Male, block = colData_Male$nested)
fit_Male <- lmFit(v_postCor_Male, design_Male, block = colData_Male$nested, correlation = second_cor_Male$consensus)
fit_Male <- eBayes(fit_Male)
results_Male <- decideTests(fit_Male)

Male samples:

   (Intercept) conditionpost
-1        6709          3572
0          575          8409
1        11315          6618

Female samples:

   (Intercept) conditionpost
-1        5412          2429
0          929          8720
1         9624          4816


The code for the interaction and contrast is as follows:

dge_pairingCor <- DGEList(countdata_filtered_pairingCor)
dge_pairingCor <- calcNormFactors(dge_pairingCor)

design_pairingCor = model.matrix(~ sex + sex:condition, colData_pairingCor)
colnames(design_pairingCor) <- make.names(colnames(design_pairingCor), unique=TRUE)
v_preCor_pairingCor = voomWithQualityWeights(dge_pairingCor, design_pairingCor, plot=FALSE)
first_cor_pairingCor = duplicateCorrelation(v_preCor_pairingCor, design = design_pairingCor, block = colData_pairingCor$patient)
v_postCor_pairingCor = voomWithQualityWeights(dge_pairingCor, design_pairingCor, block = colData_pairingCor$patient, correlation = first_cor_pairingCor$consensus, plot=FALSE)
second_cor_pairingCor = duplicateCorrelation(v_postCor_pairingCor, design = design_pairingCor, block = colData_pairingCor$patient)
fit_pairingCor <- lmFit(v_postCor_pairingCor, design_pairingCor, block = colData_pairingCor$patient, correlation = second_cor_pairingCor$consensus)

p_pairingCor <- ncol(fit_pairingCor)
cont_pairingCor <- rep(0, p_pairingCor)
cont_pairingCor[p_pairingCor] <- 1
cont_pairingCor[p_pairingCor-1] <- -1

fit_pairingCor2 =, cont_pairingCor)
fit_pairingCor2 = eBayes(fit_pairingCor2)
results_pairingCor = decideTests(fit_pairingCor2)

#results for contrast

-1    45
0  21041
1     45



#results for interaction between sex and condition

fit_pairingCor_simple = eBayes(fit_pairingCor)
results_pairingCor_simple = decideTests(fit_pairingCor_simple)

   X.Intercept. sexFemale sexMale.conditionpost sexFemale.conditionpost
-1         7628       260                  4694                    4022
0           476     20610                  8583                   10274
1         13027       261                  7854                    6835


I am confused about several things:

1) Why do I get different results when taking the pairing out of the design matrix and using duplicateCorrelation instead?
2) Which result should I trust?
3) Why are the results for the interaction term different from when I run just the male and female samples separately? Shouldn't the results be the same?

I am clearly confused as to how best to run this type of analysis. I apologize if I did not explain my problem clearly enough, and will do my best to clarify promptly. Any help and guidance would be greatly appreciated! Thank you!

Aaron Lun16k (Cambridge, United Kingdom) wrote, 7 months ago:

Why do I get different results when taking the pairing out of the design matrix and instead use duplicateCorrelation?

Because duplicateCorrelation is somewhat liberal (i.e., detects more false positives than it should). We use it because it enables us to do things that we can't do with standard blocking in a design matrix, e.g., when conditions are confounded with multiple batches. In your case, this is not a problem as you are not comparing expression values directly between patients of different sex. Thus, it is better to block on patient in the design matrix, which avoids making assumptions about the patient effect (e.g., homogeneous, no outliers or systematic differences) and constraining the correlation to a single consensus value across genes.

Which result should I trust?

The first result where you block on patient in the design matrix. If you want some genes for experimental validation, you should increase the FDR threshold (e.g., to 10%) rather than blindly shopping around for an analysis strategy that gives you a "better" result. Then, at least, you and your collaborators know where you stand with regard to the reliability of the gene list. The only extra thing I can suggest is to try turning on robust=TRUE in eBayes to see if outlier variances are distorting the shrinkage statistics.
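The `robust=TRUE` suggestion amounts to a one-line change; a sketch, assuming the `fit2` object from the original contrast fit:

```r
# Robust empirical Bayes moderation: outlier variances are identified
# and downweighted rather than being shrunk toward the trend.
fit2 <- eBayes(fit2, robust = TRUE)
summary(decideTests(fit2))
```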

Incidentally, you say that getting zero genes is unlikely, based on the numbers of genes for each interaction term by itself. I see nothing that supports your claim. You will obviously get fewer DE genes in the female condition because you have fewer samples and less power. Being unable to detect these genes doesn't mean that they're not DE to the same extent as in the male condition.

Why are the results for the interaction term different from when I run just male and female samples? Shouldn't the results be the same?

No, because different numbers of samples are used to estimate the gene-wise variance. When you subset the data, you're only using the male or female samples for estimation in each of the separate analyses. When you analyze all samples together, you get more residual d.f., more reliable variance estimates and thus greater power to detect differential expression.


Great, thank you very much for the help!

— gregory.l.stone10

When I run the same analysis in edgeR I get a lot of significant genes in the contrast. Should I just chalk this up to a difference between the two packages, or does this indicate that I'm implementing something incorrectly?

— gregory.l.stone10

Depends on whether you're using glmLRT or the QL framework. The former is also a bit liberal, whereas the latter is more accurate and, in fact, should behave much like voom (without the sample weighting).
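For reference, a sketch of the quasi-likelihood pipeline (assuming the same `dge`, `design` and `cont` objects as in the limma analysis; these are the standard edgeR QL calls):

```r
library(edgeR)

# Estimate dispersions, then fit with quasi-likelihood and test the
# same contrast used in the limma-voom analysis.
dge   <- estimateDisp(dge, design)
fitQL <- glmQLFit(dge, design, robust = TRUE)
resQL <- glmQLFTest(fitQL, contrast = cont)
topTags(resQL)
```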

— Aaron Lun16k

I used the glmLRT framework but will try the QL framework. Thank you for the advice!

— gregory.l.stone10

