I am trying to use edgeR to analyse differential gene expression from my RNA-seq dataset. The experimental design is as follows: I had 3 different treatment conditions i.e. 1 factor with 3 different levels. And there are 3 replicate RNA samples (counts libraries) for each of the 3 treatments i.e. 9 samples in total. I have made a DGEList with the counts for the 9 samples. I have filtered to remove features with metatags and low expression. I have also made a design matrix giving the information that there are 3 groups with 3 samples assigned to each group. The design matrix (called design) looks like this:
Group1 Group2 Group3 Sample1 1 0 0 Sample2 1 0 0 Sample3 1 0 0 Sample4 0 1 0 Sample5 0 1 0 Sample6 0 1 0 Sample7 0 0 1 Sample8 0 0 1 Sample9 0 0 1 I have also estimated dispersion using the following DG <- estimateDisp(DG, design, robust=TRUE)
where DG is my DGEList and design is my design matrix. Because my experimental design is very simple I don't think I need to use a GLM approach to analyse differential expression and I only want to do pair-wise comparisons i.e. find DE between group 1 and group 2 and between group 1 and group 3. So I think I want to use the edgeR classic approach. However, when I try the following code:
> et <- exactTest(DG, pair=c("group1","group2"))
I get the following error:
At least one element of given pair is not a group. Groups are: 1
Can someone please help explain where I am going wrong? I think that my DGEList is not getting the information from my data matrix (design) or perhaps my data matrix is over-complicated for my experimental design and I need to somehow include the group information in my DGEList. At the moment if I print my DGEList the $samples table looks like this:
$samples
files group lib.size norm.factors
counts3D7_1-1 counts3D7_1-1 1 13251744 2.0632398
counts3D7_2-1 counts3D7_2-1 1 13809955 0.9912349
counts3D7_3-1 counts3D7_3-1 1 12328705 1.3071140
counts3D7_1-2 counts3D7_1-2 1 12605616 0.6537380
counts3D7_2-2 counts3D7_2-2 1 13392599 0.6290064
(with not all rows printed) so I think that the correct group information isn't there. Any help is greatly appreciated.

Many thanks for your helpful suggestions. I am now calculating differential expression using the GLM approach. One further question: you say that supplying
designtoestimateDispwill use the GLM approach for estimating the dispersions. Does this mean that I don't need to go back and useqlmQLFitto estimate dispersions prior to calculating differential expression usingglmQLFTest?You still need to use both of them.
estimateDispestimates the negative binomial dispersions, whileglmQLFitestimates the quasi-likelihood dispersions. The book chapter has more details on this:http://link.springer.com/protocol/10.1007/978-1-4939-3578-9_19