I am fairly new to DESeq2; I have used it before, but only for experiments with more replicates. I am hoping for advice on how to handle DE contrasts in an experiment with 15 groups: 11 groups have 3 biological replicates each, but 4 groups are rarer and we could only obtain 2 samples for each.
I am doing two types of contrasts: (i) one group vs. the rest, and (ii) pairwise group vs. group. Afterwards, I would like to compare the set differences and set intersections of the DE genes across certain groups.
Because some groups have only 2 samples, I would like advice on how to handle the inconsistent flagging of genes based on Cook's distance: flagging works when comparing groups with 3 samples, but does not happen in contrasts involving groups with 2 samples. (The DESeq2 user guide states: "The results function automatically flags genes which contain a Cook’s distance above a cutoff for samples which have 3 or more replicates. The p values and adjusted p values for these genes are set to NA. At least 3 replicates are required for flagging, as it is difficult to judge which sample might be an outlier with only 2 replicates. This filtering can be turned off with results(dds,cooksCutoff=FALSE).")
As a result, in contrasts between groups with 3 samples, genes with count outliers get an adjusted p-value of NA and are ignored, which is ideal, but in contrasts involving groups with only 2 samples no flagging occurs. I do see genes called DE that have large variance within a group with only 2 samples, so this is an issue (I would like to exclude these genes). Because I also want to compare genes that are commonly DE across certain groups, the inconsistent DE selection seems to be a problem there too.
Should I turn off Cook's filtering (results(dds,cooksCutoff=FALSE)) for all contrasts and apply my own filter afterwards to maintain consistency? Could you advise how to apply this and how to choose a threshold? I don't have experience with this. I had thought of leaving Cook's filtering on for the contrasts between groups with 3 samples and using those outliers as a reference, but that is limited to genes with outliers in those groups.
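To make the question concrete, here is a rough sketch of the kind of filter I have in mind. It is only an illustration under my own assumptions, not something I know to be statistically sound: `flag_by_cooks` is a hypothetical helper name, and reusing the 0.99 quantile of the F(p, m - p) distribution (which, as I understand the documentation, is the default cutoff in results) across all samples, including those in 2-replicate groups, is my guess at a starting threshold.

```r
# Hypothetical helper (my own name, not part of DESeq2): flag genes whose
# largest Cook's distance over ALL samples exceeds an F-quantile cutoff.
# cooks    : genes x samples matrix of Cook's distances
# n_params : number of coefficients in the model (p)
# prob     : quantile of F(p, m - p) used as cutoff (assumed 0.99)
flag_by_cooks <- function(cooks, n_params, prob = 0.99) {
  stopifnot(is.matrix(cooks), ncol(cooks) > n_params)
  m <- ncol(cooks)                        # number of samples
  cutoff <- qf(prob, n_params, m - n_params)
  # Per-gene maximum, returning NA for genes with no Cook's distance at all
  max_cooks <- apply(cooks, 1, function(x) {
    if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
  })
  list(cutoff = cutoff,
       max_cooks = max_cooks,
       flagged = !is.na(max_cooks) & max_cooks > cutoff)
}

# Toy data so the sketch runs on its own: 3 genes, 6 samples, 2 coefficients
toy <- matrix(0.1, nrow = 3, ncol = 6,
              dimnames = list(c("geneA", "geneB", "geneC"), NULL))
toy["geneB", 4] <- 100                    # one extreme sample for geneB
out <- flag_by_cooks(toy, n_params = 2)

# Intended use with my real data (not run here; `dds` is the fitted object
# and the group names are placeholders):
#   res   <- results(dds, contrast = c("group", "A", "B"), cooksCutoff = FALSE)
#   cooks <- assays(dds)[["cooks"]]
#   out   <- flag_by_cooks(cooks, n_params = 15)
#   res_kept <- res[!out$flagged, ]
```

With my design this would be p = 15 coefficients and m = 41 samples. I am unsure whether a per-sample cutoff like this is meaningful for the 2-replicate groups, which is really the core of my question.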
I have searched through several previous questions but have not found an answer that fits my situation. Apologies if I missed a suitable response; if so, could you kindly point me to it?