Differential expression error in edgeR - maybe from design matrix?
1
0
Entering edit mode
Alex • 0
@00c53394
Last seen 3.4 years ago
United States

Hi everyone!

I'm using edgeR to analyze a dataset with two treatment groups, each with three replicates. I've followed the canonical glm functionality pipeline. Below is the code thus far. The problem lies in finding differentially expressed genes. According to the 'decideTests' and the MD plot, all the genes are extremely down-regulated, which doesn't seem like a correct output. The MD plot is extremely wonky.

I'm thinking the problem may be in the set-up of the design matrix. Some examples I've seen include a '0+group' and some do not. Not sure which approach would be best here. The output of my 'design' is below. Changing this seems to have drastic impacts on the DE genes.

Any thoughts or advice is appreciated!

library(edgeR)
setwd("myWorkingDirectory")
data <- read.delim("myData")

eyeCondition <- factor(c(rep("normal",3),rep("AMD",3)))
eyeCondition

y <- DGEList(counts = data[,3:8], group = eyeCondition, genes = data[,1:2])

keep <- rowSums(cpm(y) > 0.5) >= 3
table(keep)
y <- y[keep, , keep.lib.sizes = FALSE]

y <- calcNormFactors(y)
y$samples

design <- model.matrix(~0+eyeCondition)
rownames(design) <- colnames(y)
colnames(design) <- levels(eyeCondition)
design

             AMD normal
macula4        0      1
macula8        0      1
macula13       0      1
AMD.macula17   1      0
AMD.macula18   1      0
AMD.macula19   1      0
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$eyeCondition
[1] "contr.treatment"

y <- estimateDisp(y, design, robust = TRUE)
plotBCV(y)

fit <- glmQLFit(y, design, robust = TRUE)
plotQLDisp(fit)

qlf <- glmQLFTest(fit)

topTags(qlf, n = 15)
summary(decideTests(qlf))

# below is troublesome bit

> summary(decideTests(qlf))
       normal
Down    18656
NotSig      0
Up          0


> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] edgeR_3.32.1 limma_3.46.0

loaded via a namespace (and not attached):
 [1] BiocManager_1.30.15 compiler_4.0.5      tools_4.0.5        
 [4] Rcpp_1.0.6          tinytex_0.31        splines_4.0.5      
 [7] grid_4.0.5          locfit_1.5-9.4      xfun_0.22          
[10] statmod_1.4.36      lattice_0.20-41
design edgeR matrix RNA-seq MDplot • 1.4k views
ADD COMMENT
2
Entering edit mode
@gordon-smyth
Last seen 10 hours ago
WEHI, Melbourne, Australia

Some examples I've seen include a '0+group' and some do not. Not sure which approach would be best here.

Both approaches give identical results. You can either use ~eyeCondition or you can use ~0+eyeCondition and then define a contrast between the two groups. But you can't use 0+eyeCondition and then omit the contrast. If you do that then the results will simply test one of the group means equal to zero, which is not a useful thing to do.

Section 9.2 of the limma User's Guide explains the two approaches. Or follow Section 3.2.3 of the edgeR User's Guide, which explains how to compare one or more groups. Or even just follow the "Quick Start" guide to edgeR in Section 1.4, which gives a complete analysis pipeline for comparing two groups.

ADD COMMENT
0
Entering edit mode

Appreciate the help!

Would the command coef=2 be required here or is that redundant?

ADD REPLY
0
Entering edit mode

See ?glmQLFTest.

ADD REPLY

Login before adding your answer.

Traffic: 649 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6