In the RUVSeq manual, they state that the downstream analysis with edgeR should use the following design matrix:
design <- model.matrix(~x + W_1, data=pData(set1))
where W_1 is the estimated factor of unwanted variation.
In the edgeR manual, the batch effect comes first in the design matrix:
design <- model.matrix(~Batch+Treatment)
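A minimal base-R sketch of the two orderings (toy data; `W_1` here is just an illustrative numeric covariate standing in for RUVSeq output, not real estimated factors). Both formulas produce the same columns; only the order, and hence which coefficient is last, differs:

```r
# Toy sample annotation: a covariate of interest and one RUVSeq-style
# unwanted-variation covariate (illustrative values only)
pdata <- data.frame(
  x   = factor(c("Ctl", "Ctl", "Trt", "Trt")),
  W_1 = c(0.12, -0.34, 0.05, 0.27)
)

# RUVSeq-manual style: covariate of interest first, W_1 last
design_a <- model.matrix(~x + W_1, data = pdata)

# edgeR-manual style ordering: nuisance term first, x last
design_b <- model.matrix(~W_1 + x, data = pdata)

# Same columns either way; only the order differs
colnames(design_a)  # "(Intercept)" "xTrt" "W_1"
colnames(design_b)  # "(Intercept)" "W_1" "xTrt"
```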
Also, when I perform the DESeq2 analysis with
DESeqDataSetFromMatrix(countData=signal, colData=Design, design=~condition+batch)
I get the following header from results():
log2 fold change (MAP): batch
Wald test p-value: batch
If I instead use DESeqDataSetFromMatrix(countData=signal, colData=Design, design=~batch+condition),
I get the following, which seems appropriate:
log2 fold change (MAP): condition WEN vs WNN
Wald test p-value: condition WEN vs WNN
What is the correct way to include factors of unwanted variation in the DESeq2 and edgeR design matrices?
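As I understand it, DESeq2's results() reports the last variable in the design formula by default, which is why the header changes with the ordering. A sketch with toy colData (the DESeq2 calls are left as comments since they need real count data); note you can also name the comparison explicitly, so the ordering no longer matters for which result you get:

```r
# Toy colData mirroring the question's factors (values are illustrative);
# WNN is set as the reference level so the contrast is WEN vs WNN
Design <- data.frame(
  condition = factor(c("WNN", "WNN", "WEN", "WEN"), levels = c("WNN", "WEN")),
  batch     = factor(c("b1", "b2", "b1", "b2"))
)

# With batch first, the last coefficient is the condition effect,
# which is what DESeq2's results() reports by default
design_mat <- model.matrix(~batch + condition, data = Design)
tail(colnames(design_mat), 1)  # "conditionWEN"

# dds <- DESeqDataSetFromMatrix(countData = signal, colData = Design,
#                               design = ~batch + condition)
# dds <- DESeq(dds)
# results(dds)  # defaults to the last variable: condition WEN vs WNN
# Or request the comparison explicitly, regardless of ordering:
# results(dds, contrast = c("condition", "WEN", "WNN"))
```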
Thank you for the answer. And sorry, it said only "batch", not "condition batch".
Similarly, in edgeR, it doesn't really matter for GLM fitting or dispersion estimation whether batch is put at the start or end of the design. The only thing that changes is the interpretation of the coefficients, and the coef or contrast you need to supply to glmLRT. Keeping batch at the start is simply convenient, as the last coefficient then represents the treatment effect. The last coefficient is dropped by default in glmLRT, so you don't need any extra arguments to do the DE test of interest.
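The point that ordering affects only the interpretation of coefficients, not the fit itself, can be illustrated with a plain linear model in base R (a stand-in for edgeR's GLM; the data are random, and the edgeR calls are shown as comments for orientation):

```r
set.seed(1)
Batch     <- factor(rep(c("b1", "b2"), each = 3))
Treatment <- factor(rep(c("Ctl", "Trt"), times = 3))
y <- rnorm(6)

# Same model, two orderings of the design formula
fit1 <- lm(y ~ Batch + Treatment)
fit2 <- lm(y ~ Treatment + Batch)

# Fitted values are identical: both designs span the same column space
all.equal(fitted(fit1), fitted(fit2))  # TRUE

# Only the coefficient order changes; with Batch first, the treatment
# effect is the last coefficient -- the one glmLRT tests by default
names(coef(fit1))  # "(Intercept)" "Batchb2" "TreatmentTrt"

# edgeR sketch (needs real count data in a DGEList 'dge'):
# fit <- glmFit(dge, model.matrix(~Batch + Treatment))
# lrt <- glmLRT(fit)  # tests the last coefficient: TreatmentTrt
```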