Hi,
I have a paired study design similar to section 3.5 in edgeR manual - 25 subjects including with disease (D) and controls (C), expression montinored before drug (B) and after drug (A). I also wish to control for Batch (1/2) and other medication (0/1). The data looks something like this:
subject Disease treatment Batch other_medication
1 D B 1 0
1 D A 1 0
2 D B 1 1
2 D A 1 1
3 D B 2 0
3 D A 2 0
4 D B 1 1
4 D A 1 1
5 D B 1 1
5 D A 1 0
6 D B 2 0
6 D A 2 1
7 D B 1 1
8 D B 1 1
8 D A 1 0
9 D B 2 1
9 D A 2 1
10 D B 1 1
10 D A 2 1
11 D B 1 1
11 D A 1 1
12 D B 1 1
13 D A 1 0
14 D A 1 1
15 D A 1 1
16 D A 1 0
17 D A 1 1
18 D A 1 1
19 D B 2 1
19 D A 2 1
20 D B 2 1
20 D A 2 1
21 C B 2 0
21 C A 2 0
22 C B 1 0
22 C A 1 0
23 C B 2 0
23 C A 2 0
24 C B 1 0
24 C A 1 0
25 C B 2 0
25 C A 2 0
One of the camparisons I wish to do is to find DE genes between disease subjects and controls before taking drug. The dataset is unbalanced, but for just disease-control comparison @ baseline I guess it should not matter. I used both edgeR and limma-voom, but they gave very different results. I would really appreciate any help.
Voom:
Combined Disease and treatment into single variable combo:
combo <- factor(paste(Disease,Treatment,sep="."))
And used the following design:
design <- model.matrix(~0 + combo + Batch + other_medication)
And then block by subject using duplicate correlation:
v <- voom(rawcount,design, lib.size = colSums(rawcount)*y$samples$norm.factors) corfit <- duplicateCorrelation(v, design, block=subject) v <- voom(rawcount,design,lib.size = colSums(rawcount)*y$samples$norm.factors, block = subject,correlation=corfit$consensus) fit <- lmFit(v,design,block = subject, correlation = corfit$consensus) cm <- makeContrasts(DvsC.B = D.B - C.B, levels=design) fit2 <- contrasts.fit(fit, cm) fit2 <- eBayes(fit2) top <- topTable(fit2, adjust="BH",n=1000000)
I get 4000 DE genes with FDR<0.05.
edgeR:
I used a subset of data with disease and control samples at before drug timepoint. And used the fowling design:
design <- model.matrix(~batch + other_medication + disease) y <- DGEList(counts=rawcount, genes=row.names(rawcount)) y <- calcNormFactors(y) y <- estimateGLMCommonDisp(y,design, verbose=TRUE) y <- estimateGLMTrendedDisp(y,design) y <- estimateGLMTagwiseDisp(y,design) fit <- glmFit(y,design) lrt <- glmLRT(fit,coef=4)
But this did not give me any DE genes.
Are these two approaches comparable? And which is more reliable? Thank you for the help!
AD
My point was that you don't need to intervene in this way. voom uses edgeR's normalization factors automatically, as the help page explains.