Question: Order of design matrix for edgeR analysis
lornaas0 wrote:

I'm running an edgeR analysis on RNA-seq data that was run in two separate batches, which I want to control for. Should the variable I'm interested in go first or last in the formula?

i.e. "design <- model.matrix(~group + batch)"

or

"design <- model.matrix(~batch + group)"?

The edgeR manual says the group should go last, similar to DESeq2, but I have been told by a colleague that it should go first.

The order makes no difference. Either order will give exactly the same results in edgeR.

What have you read in the edgeR User's Guide that makes you think that group should go last? The User's Guide does not actually say that.
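
As a quick illustration of why the order cannot matter, here is a minimal sketch in base R. The sample layout (two groups, two batches, eight samples) and the level names are made up for the example and are not taken from the original poster's data. The two formulas produce the same columns, just in a different order, so they describe the same model.

```r
# Hypothetical sample annotation: 2 groups x 2 batches, 2 replicates per cell
group <- factor(rep(c("ctrl", "trt"), each = 4))
batch <- factor(rep(c("b1", "b2"), times = 4))

design_gb <- model.matrix(~ group + batch)
design_bg <- model.matrix(~ batch + group)

colnames(design_gb)  # "(Intercept)" "grouptrt" "batchb2"
colnames(design_bg)  # "(Intercept)" "batchb2" "grouptrt"

# Reorder the columns of one design to match the other and compare entries
same_columns <- all(design_gb[, colnames(design_bg)] == design_bg)
same_columns  # TRUE
```

Only the position of the group column changes, which matters solely when a function picks a coefficient by position.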

Answer: Order of design matrix for edgeR analysis
James W. MacDonald wrote:

The order of the covariates can affect the interpretation of the coefficients, but otherwise it doesn't matter, unless of course you aren't sure what the coefficients mean (usually not an issue with a simple model like this). Also, by default glmLRT and glmQLFTest test the last coefficient, but setting up your design matrix just so you can rely on that default is not something I would normally bother with. It's easy enough to say which coefficient you want to drop, so just do that.
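
A minimal sketch of naming the coefficient explicitly, assuming edgeR is installed. The counts are simulated purely for illustration, and the sample layout and level names (and hence the coefficient name "grouptrt") are hypothetical. Here group is deliberately not the last term in the formula, so the default would test the batch coefficient instead.

```r
library(edgeR)

set.seed(1)
# Simulated counts, for illustration only: 200 genes x 8 samples
counts <- matrix(rnbinom(200 * 8, mu = 50, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))
group <- factor(rep(c("ctrl", "trt"), each = 4))
batch <- factor(rep(c("b1", "b2"), times = 4))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

design <- model.matrix(~ group + batch)  # group is NOT the last column
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# Name the coefficient to test instead of relying on the default (last column)
res_group <- glmQLFTest(fit, coef = "grouptrt")
topTags(res_group)
```

The same coef argument works with glmLRT on a fit from glmFit. Referring to the coefficient by name rather than by column number also keeps the call correct if the terms in the formula are later reordered.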