Hi, I'm trying to do an analysis that is slightly different from the usual, but I would like to take advantage of the optimal design of DESeq2, so I wonder whether it would be possible to apply it (and how). I have a counts table with columns as different conditions and time points, and "genes" (they are actually guides) as rows. In addition, for each condition+time, I have two fractions: pool (total) and positive-selected (a subsample of the pool, but it will not have the same distribution). In each of those samples, some of the "genes" are controls, meaning that they will behave in an expected way across conditions and time. What I would like to figure out is how the other "genes" behave with respect to these controls at each time point of each condition, meaning getting a fold change and padj for the difference from the "normal behavior" in the positive fraction, after correcting for the prior distribution in the pool. Would that be possible with DESeq2? I could do it manually, but then I wouldn't know how to apply valuable tools such as shrinkage.
I'm reading that you have conditions, time points, and two groups. Some genes will behave in an "expected way" and others differently, in a different way per time point and per condition. I tried to follow the part about wanting to compare to the behavior of some genes in one group, and also across groups.
There's a lot going on here, and it goes beyond the simple software support questions I usually field. I'd recommend collaborating with a statistician. Possibly you can use DESeq2 off the shelf but there seems to be something about learning some behavior across some genes and applying it to other genes which is out of scope. DESeq2 does that just for size factor estimation, not for clustering the genes by biological differences.
Could you treat the control 'genes' as non-differentially expressed genes over time and conditions and normalize the rest of the data with these control genes? This method is called normalization with housekeeping genes.
"Housekeeping genes (HG) are genes that play a role in the basic functions of a cell and so are believed to be non-differentially expressed under the biological conditions of interest. HG normalization assumes that these genes are truly not differentially expressed, and furthermore that they are affected by technical effects the same way as differentially expressed genes. These HG must be identified a priori (which is the case with you)....... Normalization using HG can either equalize the read count of the gene or perform a conventional normalization procedure on a set of HG." (Evans et al., 2018)
So in your case, you could estimate the normalization (size) factors with DESeq2 using only the genes you expect not to change over time, and then apply those factors to the counts of the rest of the genes. The differential expression analysis should then give you the genes that do differ over time/conditions.
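To make the idea concrete, here is a minimal sketch of median-of-ratios normalization restricted to control rows. It is written in plain Python only to show the arithmetic; it is not DESeq2's code. In R, as far as I know, `estimateSizeFactors` accepts a `controlGenes` argument for this purpose. The function names `size_factors` and `normalize` and the toy counts are made up for illustration.

```python
import math
from statistics import median


def size_factors(counts, control_rows):
    """Median-of-ratios size factors computed from control rows only.

    counts: list of rows (one per guide), each a list of per-sample counts.
    control_rows: indices of the rows assumed not to change (hypothetical input).
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for i in control_rows:
        row = counts[i]
        if any(c <= 0 for c in row):
            continue  # a zero count makes the geometric mean zero, so skip the row
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no control row has all-positive counts")
    # One factor per sample: median ratio to the per-row geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts, factors):
    """Divide every count by the size factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Toy data (invented): two samples, the second sequenced twice as deeply.
counts = [
    [10, 20],    # control
    [100, 200],  # control
    [50, 100],   # control
    [30, 240],   # guide of interest
]
factors = size_factors(counts, control_rows=[0, 1, 2])
normalized = normalize(counts, factors)
log2_fc = math.log2(normalized[3][1] / normalized[3][0])
```

After normalization the control rows are equal across the two samples, and the remaining change in the last row is what is left to test. Whether the controls are stable enough to carry the normalization is an assumption that should be checked on the real data.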
Hope this helps.