Differential expression error in edgeR - maybe from design matrix?
1
0
Entering edit mode
Alex • 0
@00c53394
Last seen 3 months ago
United States

Hi everyone!

I'm using edgeR to analyze a dataset with two treatment groups, each with three replicates. I've followed the canonical glm functionality pipeline. Below is the code thus far. The problem lies in finding differentially expressed genes. According to the 'decideTests' and the MD plot, all the genes are extremely down-regulated, which doesn't seem like a correct output. The MD plot is extremely wonky.

I'm thinking the problem may be in the set-up of the design matrix. Some examples I've seen include a '0+group' and some do not. Not sure which approach would be best here. The output of my 'design' is below. Changing this seems to have drastic impacts on the DE genes.

Any thoughts or advice is appreciated!

library(edgeR)
setwd("myWorkingDirectory")

eyeCondition <- factor(c(rep("normal",3),rep("AMD",3)))
eyeCondition

y <- DGEList(counts = data[,3:8], group = eyeCondition, genes = data[,1:2])

keep <- rowSums(cpm(y) > 0.5) >= 3
table(keep)
y <- y[keep, , keep.lib.sizes = FALSE]

y <- calcNormFactors(y)
y$samples design <- model.matrix(~0+eyeCondition) rownames(design) <- colnames(y) colnames(design) <- levels(eyeCondition) design AMD normal macula4 0 1 macula8 0 1 macula13 0 1 AMD.macula17 1 0 AMD.macula18 1 0 AMD.macula19 1 0 attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$eyeCondition
[1] "contr.treatment"

y <- estimateDisp(y, design, robust = TRUE)
plotBCV(y)

fit <- glmQLFit(y, design, robust = TRUE)
plotQLDisp(fit)

qlf <- glmQLFTest(fit)

topTags(qlf, n = 15)
summary(decideTests(qlf))

# below is troublesome bit

> summary(decideTests(qlf))
normal
Down    18656
NotSig      0
Up          0

> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] edgeR_3.32.1 limma_3.46.0

loaded via a namespace (and not attached):
[1] BiocManager_1.30.15 compiler_4.0.5      tools_4.0.5
[4] Rcpp_1.0.6          tinytex_0.31        splines_4.0.5
[7] grid_4.0.5          locfit_1.5-9.4      xfun_0.22
[10] statmod_1.4.36      lattice_0.20-41

design edgeR matrix RNA-seq MDplot • 207 views
2
Entering edit mode
@gordon-smyth
Last seen 6 hours ago
WEHI, Melbourne, Australia

Some examples I've seen include a '0+group' and some do not. Not sure which approach would be best here.

Both approaches give identical results. You can either use ~eyeCondition or you can use ~0+eyeCondition and then define a contrast between the two groups. But you can't use 0+eyeCondition and then omit the contrast. If you do that then the results will simply test one of the group means equal to zero, which is not a useful thing to do.

Section 9.2 of the limma User's Guide explains the two approaches. Or follow Section 3.2.3 of the edgeR User's Guide, which explains how to compare one or more groups. Or even just follow the "Quick Start" guide to edgeR in Section 1.4, which gives a complete analysis pipeline for comparing two groups.

0
Entering edit mode

Appreciate the help!

Would the command coef=2 be required here or is that redundant?

0
Entering edit mode

See ?glmQLFTest.