I am analyzing an experiment in which cells go through a process that either selects them or not. I would like to compare the selected cells, the unselected cells, and a control pool of cells that never went through the process. Below is my study design matrix.
Treatment   Control  Treated  Selected
Unselected  Treated  1        0
Selected    Treated  1        1
Selected    Treated  1        1
Unselected  Treated  1        0
Control     Control  0        0
Control     Control  0        0
Control     Control  0        0
Control     Control  0        0
The first thing I did was simply to use the Treatment column as my design, as below:
library(DESeq2)
# Single-factor design: one coefficient per level of Treatment
dds <- DESeqDataSetFromMatrix(countData = qc_counts,
                              colData = qc_hash,
                              design = ~ Treatment)
dds <- estimateSizeFactors(dds)
normcounts <- counts(dds, normalized = TRUE)
dds <- DESeq(dds)
res <- results(dds)
I then realized that I was not properly accounting for the fact that the treated cells differ from the control regardless of whether they were selected or unselected. This led me to use the formula ~ Treated + Selected, to find the difference between selected and unselected cells while also accounting for whether the cells were treated. Below is that code:
# Multi-factor design: Treated and Selected as separate terms
dds <- DESeqDataSetFromMatrix(countData = qc_counts,
                              colData = qc_hash,
                              design = ~ Treated + Selected)
dds <- estimateSizeFactors(dds)
normcounts <- counts(dds, normalized = TRUE)
dds <- DESeq(dds)
res <- results(dds)
head(res)
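To see what each design actually estimates, the two model matrices can be compared directly in base R. This is a minimal sketch, not part of the original analysis: the `qc_hash` table is rebuilt by hand from the design matrix above, and the factor level order (Control as the reference level) is an assumption.

```r
# Hypothetical reconstruction of the sample table (colData) from the post.
# Assumption: Control is the reference level of Treatment.
qc_hash <- data.frame(
  Treatment = factor(c("Unselected", "Selected", "Selected", "Unselected",
                       "Control", "Control", "Control", "Control"),
                     levels = c("Control", "Unselected", "Selected")),
  Treated  = c(1, 1, 1, 1, 0, 0, 0, 0),
  Selected = c(0, 1, 1, 0, 0, 0, 0, 0)
)

# Model matrices for the two designs used above
mm_single <- model.matrix(~ Treatment, data = qc_hash)
mm_multi  <- model.matrix(~ Treated + Selected, data = qc_hash)

colnames(mm_single)  # "(Intercept)" "TreatmentUnselected" "TreatmentSelected"
colnames(mm_multi)   # "(Intercept)" "Treated" "Selected"

# If stacking the two matrices side by side does not raise the rank,
# both designs span the same column space.
rank_single   <- qr(mm_single)$rank
rank_multi    <- qr(mm_multi)$rank
rank_combined <- qr(cbind(mm_single, mm_multi))$rank
c(rank_single, rank_multi, rank_combined)
```

With this sample layout all three ranks are 3, so the two designs describe the same three group means and differ only in what each coefficient compares: in ~ Treatment the coefficients are each treated group vs. Control, while in ~ Treated + Selected the Selected coefficient is Selected vs. Unselected.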
I then ran each set of results through lfcShrink to get the shrunken log2 fold changes (L2FC), keeping the adjusted p-values from the unshrunken results.
# coef = 3 refers to the third entry of resultsNames(dds)
testshrunk <- lfcShrink(dds = dds, res = res, coef = 3)  # we want the shrunken L2FC
test <- results(dds, parallel = FALSE)                   # everything else is from the unshrunken results
test$log2FoldChange <- testshrunk$log2FoldChange
testdf <- as.data.frame(test)
When I did this and plotted the p-values from the two designs against each other, I realized that they were mostly identical, with a few that were not.
I did not expect any of the p-values to be the same after accounting for whether the cells had been treated. My main question is why the p-values do not differ, and whether the cause is my code or a problem with my design.
The link to the graph is below. I also plotted the old vs. new L2FC and they do appear to differ, so the issue may be limited to the p-values.
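Beyond the scatter plot, the two sets of p-values can also be compared numerically. The sketch below is only illustrative: `compare_pvalues` is a hypothetical helper, and `res_single` and `res_multi` are made-up stand-ins for `as.data.frame(results(dds))` from the two designs, not real output.

```r
# Hypothetical stand-ins for the two result tables; the numbers are invented
# purely to exercise the helper below.
res_single <- data.frame(pvalue = c(0.010, 0.200, NA, 0.500),
                         row.names = c("g1", "g2", "g3", "g4"))
res_multi  <- data.frame(pvalue = c(0.010, 0.250, 0.300, 0.500),
                         row.names = c("g1", "g2", "g3", "g4"))

# Count how many p-values agree (within a tolerance) between two result tables
compare_pvalues <- function(a, b, tol = 1e-12) {
  genes <- intersect(rownames(a), rownames(b))  # genes present in both tables
  pa <- a[genes, "pvalue"]
  pb <- b[genes, "pvalue"]
  keep <- !is.na(pa) & !is.na(pb)               # drop genes with NA p-values
  same <- abs(pa[keep] - pb[keep]) <= tol
  list(n_compared = sum(keep),
       n_same     = sum(same),
       differing  = genes[keep][!same])
}

cmp <- compare_pvalues(res_single, res_multi)
cmp$n_compared  # 3
cmp$n_same      # 2
cmp$differing   # "g2"
```

If nearly every gene falls in `n_same`, it is worth checking that the two result tables really come from different designs and different coefficients, for example by inspecting `resultsNames(dds)` for each fit.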
Thanks for your help!
https://imgur.com/a/nSQNYn8
I'm not sure exactly, but there is little effect from the Selected variable. I wouldn't expect the p-values to line up on y=x, but I don't see a bug in your code above (although you copy-pasted the multifactor twice btw).