Question

DESeq2 Identical P values in both standard and multi factor design

dtlloyd • 0

@dtlloyd-22032

I am analyzing an experiment in which cells go through a process and are either selected or not. I would like to compare the selected cells with the unselected cells, and both with a control pool of cells that never went through the process. Below is my study design matrix.

Treatment     Control    Treated    Selected
Unselected    Treated    1          0
Selected      Treated    1          1
Selected      Treated    1          1
Unselected    Treated    1          0
Control       Control    0          0
Control       Control    0          0
Control       Control    0          0
Control       Control    0          0

The first thing I did was simply to use the Treatment column as the design, as below:

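library(DESeq2)

# Single-factor design: Treatment has three levels (Control, Selected, Unselected)
dds <- DESeqDataSetFromMatrix(countData = qc_counts,
                              colData = qc_hash,
                              design = ~ Treatment)
dds <- estimateSizeFactors(dds)

# Normalized counts, kept for plotting and QC
normcounts <- counts(dds, normalized = TRUE)
dds <- DESeq(dds)
res <- results(dds)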

I then realized that I was not properly accounting for the fact that the treated cells differ from the control regardless of whether they were selected or unselected. This led me to use the formula ~ Treated + Selected to find the difference between selected and unselected cells while also accounting for whether the cells were treated. Below is that code:

# Multi-factor design: Treated and Selected as separate terms
dds <- DESeqDataSetFromMatrix(countData = qc_counts,
                              colData = qc_hash,
                              design = ~ Treated + Selected)

dds <- estimateSizeFactors(dds)
normcounts <- counts(dds, normalized = TRUE)
dds <- DESeq(dds)
res <- results(dds)
head(res)
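
One way to check what each design actually estimates is to build the two model matrices from the sample table and compare them. The following is a minimal base-R sketch, not part of the original analysis: the `coldata` data frame is retyped from the design table above and stands in for `qc_hash`, and it assumes `Treated` and `Selected` are stored as factors and that `Treatment` uses the default alphabetical level order.

```r
# Sample table retyped from the design matrix in the question (stand-in for qc_hash).
# Assumption: Treated and Selected are factors; Treatment uses default alphabetical levels.
coldata <- data.frame(
  Treatment = factor(c("Unselected", "Selected", "Selected", "Unselected",
                       "Control", "Control", "Control", "Control")),
  Treated   = factor(c(1, 1, 1, 1, 0, 0, 0, 0)),
  Selected  = factor(c(0, 1, 1, 0, 0, 0, 0, 0))
)

# Model matrices implied by the two design formulas
mm_single <- model.matrix(~ Treatment, coldata)
mm_multi  <- model.matrix(~ Treated + Selected, coldata)

# Column names show which coefficients each design estimates
print(colnames(mm_single))
print(colnames(mm_multi))

# Rank of each matrix, and of the two stacked side by side.
# If the stacked rank equals the individual ranks, the columns of both
# matrices span the same space, i.e. the two designs describe the same
# three group means with different coefficients.
rank_single   <- qr(mm_single)$rank
rank_multi    <- qr(mm_multi)$rank
rank_combined <- qr(cbind(mm_single, mm_multi))$rank
print(c(single = rank_single, multi = rank_multi, combined = rank_combined))
```

If the ranks agree, the two fits are reparameterizations of one model, and what differs is which coefficient `results(dds)` reports by default (the last term of the design, last level over the reference level). Calling `resultsNames(dds)` on each fitted object is a quick way to confirm which comparison each `res` holds.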

I then ran each through lfcShrink to get the shrunken log2 fold changes (L2FC) and took the adjusted p-values from the unshrunken results.

# Shrunken L2FC for the third coefficient
testshrunk <- lfcShrink(dds = dds, res = res, coef = 3)
# Everything else comes from the unshrunken results
test <- results(dds, parallel = FALSE)
# Swap in the shrunken L2FC, keeping the unshrunken p-values
test$log2FoldChange <- testshrunk$log2FoldChange
testdf <- as.data.frame(test)

When I did this and plotted the two sets of results against each other, I realized that the p-values were mostly identical, with a few that were not. Below is the link to the graph:

https://imgur.com/a/nSQNYn8

I did not expect any of the p-values to be the same after accounting for whether the cells had been treated. My main question is why the p-values do not differ, and whether the cause is my code or a design problem on my end.

deseq2 r • 420 views

dtlloyd • 4.6 years ago

score 0 · Answer 1 · 2019-10-01

Michael Love 41k

@mikelove

United States

Can you plot the coefficients associated with selected?

E.g.:

# The variable name must match the design column (Selected), which must be a factor
res.sel <- results(dds, contrast = c("Selected", "1", "0"))
with(res.sel, plot(log2FoldChange, -log10(pvalue)))

Michael Love • 4.6 years ago

Attached is the link to the graph below. I also plotted old vs new l2fc and they seem to be different so it may just be the p values.

Thanks for your help!

https://imgur.com/a/nSQNYn8

dtlloyd • 4.6 years ago

I'm not sure exactly, but there is little effect from the selected variable.

I wouldn't expect the p-values to line up on y=x, but I don't see a bug in your code above (although you copy-pasted the multifactor twice btw).

Michael Love • 4.6 years ago