Question: Advice on DESeq2 design
12 months ago by
pocket.change.j0 wrote:

Hi all,

I know this has been explained numerous times and for various scenarios (some of which I feel are more complicated than what I'm attempting to achieve here), but I struggle to understand how to properly design my analysis.

So what I have is one cell line, infected with either a control (ntc in the table below) or one of two shRNAs (A and B) targeting a single gene, in the presence or absence of over-expressed gene X, sampled at day 2 and day 7. So basically:

> colData.d
   sample conditionX knockdown   shRNA replicate timepoint
1       1        ctr       ntc     ntc      rep1      day2
2       2        ctr       ntc     ntc      rep2      day2
3       3        ctr       ntc     ntc      rep3      day2
4       4        ctr knockdown shRNA-A      rep1      day2
5       5        ctr knockdown shRNA-A      rep2      day2
6       6        ctr knockdown shRNA-B      rep1      day2
7       7        ctr knockdown shRNA-B      rep2      day2
8       8      geneX       ntc     ntc      rep1      day2
9       9      geneX       ntc     ntc      rep2      day2
10     10      geneX knockdown shRNA-A      rep1      day2
11     11      geneX knockdown shRNA-A      rep2      day2
12     12      geneX knockdown shRNA-B      rep1      day2
13     13      geneX knockdown shRNA-B      rep2      day2
14     14        ctr       ntc     ntc      rep1      day7
15     15        ctr       ntc     ntc      rep2      day7
16     16        ctr knockdown shRNA-A      rep1      day7
17     17        ctr knockdown shRNA-A      rep2      day7
18     18        ctr knockdown shRNA-B      rep1      day7
19     19        ctr knockdown shRNA-B      rep2      day7
20     20      geneX       ntc     ntc      rep1      day7
21     21      geneX       ntc     ntc      rep2      day7
22     22      geneX knockdown shRNA-A      rep1      day7
23     23      geneX knockdown shRNA-A      rep2      day7
24     24      geneX knockdown shRNA-B      rep1      day7
25     25      geneX knockdown shRNA-B      rep2      day7

What I would basically like to know is the effect of the knockdown in ctr or geneX cells at the different time points. So far I have created a new factor (condition) that combines conditionX with timepoint, and analyzed the "knockdown" effect. But I feel it might be better to control for differences between shRNA A and B in the knockdown effect. However, when I try that with e.g. design = ~shRNA + knockdown, I get a "model matrix is not full rank" error. If I understood correctly, I should try treating this as a nested scenario? Is that correct?
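
The "not full rank" error can be reproduced from the sample table alone, without any counts. The following is a minimal base-R sketch, assuming the 25-sample layout shown above; the level names shA and shB are illustrative stand-ins for shRNA-A and shRNA-B.

```r
# Rebuild the two relevant columns of the sample table (25 samples).
# "shA"/"shB" are illustrative stand-ins for shRNA-A/shRNA-B.
shRNA <- factor(
  c("ntc", "ntc", "ntc", "shA", "shA", "shB", "shB",
    rep(c("ntc", "ntc", "shA", "shA", "shB", "shB"), times = 3)),
  levels = c("ntc", "shA", "shB")
)
knockdown <- factor(ifelse(shRNA == "ntc", "ntc", "knockdown"),
                    levels = c("ntc", "knockdown"))

# Design matrix for ~ shRNA + knockdown
mm <- model.matrix(~ shRNA + knockdown)

# The matrix has 4 columns but only rank 3 ...
ncol(mm)
qr(mm)$rank

# ... because the knockdown indicator is exactly the sum of the two
# shRNA indicators, so it adds no new information.
all(mm[, "knockdownknockdown"] == mm[, "shRNAshA"] + mm[, "shRNAshB"])
```

Since every knockdown sample is either shA or shB, no model can separate a "knockdown" coefficient from the two shRNA coefficients, which is what the error message reports.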

Thank you so much for any suggestions on how to approach this.

Jan

deseq2 • 202 views
modified 12 months ago by Michael Love • written 12 months ago by pocket.change.j0
12 months ago by
Michael Love wrote:

To clear things up, it helps to remove redundant columns. For example, 'knockdown' carries less information than 'shRNA' and is linearly dependent on it (every knockdown sample is either shRNA A or shRNA B), so it can be removed. You are then left with condition, shRNA and day. You can look at shRNA B vs ntc or A vs ntc and cross this with day and condition in many ways. If you just want to compute fold changes for all combinations of presence/absence (of gene X) and day2/day7, then the design you suggested is the easiest: simply combine the factors into one.
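
Combining the factors into one could look like the base-R sketch below. It assumes the sample layout from the question; the data frame name `coldata`, the shortened level names (shA, shB) and the underscore separator are illustrative choices, not something taken from the original analysis.

```r
# Sample table without the redundant 'knockdown' column.
# Names and level labels here are illustrative assumptions.
coldata <- data.frame(
  conditionX = rep(c("ctr", "geneX", "ctr", "geneX"), times = c(7, 6, 6, 6)),
  shRNA      = c("ntc", "ntc", "ntc", "shA", "shA", "shB", "shB",
                 rep(c("ntc", "ntc", "shA", "shA", "shB", "shB"), times = 3)),
  timepoint  = rep(c("day2", "day7"), times = c(13, 12))
)

# One combined factor: conditionX x shRNA x timepoint (12 groups).
coldata$condition <- factor(
  paste(coldata$conditionX, coldata$shRNA, coldata$timepoint, sep = "_")
)

# Replicates per group
table(coldata$condition)

# With one coefficient per group and no intercept, the design is full rank.
mm <- model.matrix(~ 0 + condition, data = coldata)
qr(mm)$rank == ncol(mm)
```

Using only letters, digits and underscores in the level names keeps the resulting coefficient names safe to use as R column names.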

Is the problem just that you want to take the average of the A and B shRNAs? You can take averages using a numeric- or list-style contrast. You can use a design of ~0 + condition, where condition is the combined factor, and then post the resultsNames(dds) that you obtain, and I'll assist with how to structure the average effect.
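
As a rough illustration of a list-style contrast, the sketch below averages the two shRNA groups against ntc for ctr cells at day 2. It is not the thread's actual data: the counts are simulated with `makeExampleDESeqDataSet()`, the level names are hypothetical, and the real coefficient names should be read off `resultsNames(dds)` rather than copied from here.

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for the real data: 500 genes x 25 samples.
dds <- makeExampleDESeqDataSet(n = 500, m = 25)

# Combined factor (conditionX x shRNA x timepoint); level names are
# illustrative and follow the sample order of the table in the question.
dds$condition <- factor(paste(
  rep(c("ctr", "geneX", "ctr", "geneX"), times = c(7, 6, 6, 6)),
  c("ntc", "ntc", "ntc", "shA", "shA", "shB", "shB",
    rep(c("ntc", "ntc", "shA", "shA", "shB", "shB"), times = 3)),
  rep(c("day2", "day7"), times = c(13, 12)),
  sep = "_"
))

# One coefficient per group, no intercept.
design(dds) <- ~ 0 + condition
dds <- DESeq(dds)

# Coefficient names are "condition" followed by the level name.
resultsNames(dds)

# Average of shA and shB versus ntc, in ctr cells at day 2:
# 1/2 * (shA + shB) - 1 * ntc
res_avg <- results(
  dds,
  contrast = list(
    c("conditionctr_shA_day2", "conditionctr_shB_day2"),
    "conditionctr_ntc_day2"
  ),
  listValues = c(1/2, -1)
)
head(res_avg)
```

The same pattern would apply to the other three conditionX/timepoint combinations by swapping in the corresponding coefficient names.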

pocket.change.j0 wrote:

Thank you so much. Yes, perhaps I'm just confused about what the best approach is to extract the DE information.

Bottom line, both my shRNAs (A and B) target the same gene, so I think it's safe to assume that DE genes common to both are likely specific effects, whereas changes seen with, e.g., shRNA-A but not shRNA-B could indicate an off-target effect. I initially thought it would be a good idea to treat both as one "knockdown" condition, which would simply treat all shRNA samples as 4 replicates, and extract genes this way.

But then I realized that even specific effects can differ between the two shRNAs (due to, for instance, knockdown level), and perhaps it would be a good idea to account for this somehow to increase the sensitivity of DE discovery? Maybe still treating them all as one knockdown condition, but with a sort of "batch" effect that comes from using two different shRNAs? That's what I was trying to come up with.

Alternatively, as you suggested, I can analyze the shRNAs individually and pull out the common genes. What do you think would be best?

Thanks again!

Michael Love wrote:

I would get separate effects, so that you have an A-specific and a B-specific result, by combining the three factors (condition, shRNA and day) into one new variable called 'condition' and using ~0 + condition. Finally, if you want a consistent set, you can just look at those genes where both comparisons have a small adjusted p-value. This is perhaps better than averaging if you want to require that both are significant.
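
A sketch of the separate-effects approach, for ctr cells at day 2, might look as follows. The counts are simulated (so few or no genes are expected to pass), the level names are hypothetical, and the 0.1 cutoff is only an assumed reading of "small adjusted p-value".

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for the real data: 500 genes x 25 samples.
dds <- makeExampleDESeqDataSet(n = 500, m = 25)

# Combined factor; level names are illustrative and follow the sample
# order of the table in the question.
dds$condition <- factor(paste(
  rep(c("ctr", "geneX", "ctr", "geneX"), times = c(7, 6, 6, 6)),
  c("ntc", "ntc", "ntc", "shA", "shA", "shB", "shB",
    rep(c("ntc", "ntc", "shA", "shA", "shB", "shB"), times = 3)),
  rep(c("day2", "day7"), times = c(13, 12)),
  sep = "_"
))
design(dds) <- ~ 0 + condition
dds <- DESeq(dds)

# Separate effect of each shRNA versus ntc (ctr cells, day 2).
res_A <- results(dds, contrast = c("condition", "ctr_shA_day2", "ctr_ntc_day2"))
res_B <- results(dds, contrast = c("condition", "ctr_shB_day2", "ctr_ntc_day2"))

# Assumed threshold for a "small" adjusted p-value.
alpha <- 0.1

# which() drops genes whose padj is NA (e.g. filtered for low counts).
sig_A <- rownames(res_A)[which(res_A$padj < alpha)]
sig_B <- rownames(res_B)[which(res_B$padj < alpha)]

# Genes significant for both shRNAs: the consistent set.
consistent <- intersect(sig_A, sig_B)
length(consistent)
```

Genes in `setdiff(sig_A, sig_B)` or `setdiff(sig_B, sig_A)` are the ones significant for only one shRNA, which is where potential off-target effects would show up.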