I am currently using DESeq2 to analyze differential expression in some bulk RNA-seq samples, and am running into an issue that I assume is an outlier issue? I essentially have two groups with two conditions (in vitro and invivo, tbx and hcn). I perform DESeq analyses on the in vitro and in vivo samples separately (tbx vs hcn for both). The invivo group is rather small - 5 hcn samples and 4 tbx samples (the in vitro group is relatively larger, with about 40 samples each). For the in vivo sample, I find that I have significantly fewer differentially expressed genes (we'd expect them to be similar). A large number (about 12,000/25,000) of genes are marked with p-value NA, which I presume is because there was an outlier (detected by Cook's distance) throwing the error - and since the sample size is under 7 the trimmed means estimate isn't used (I should mention that this number of NA genes didn't notably change when I used the replaceOutliersWithTrimmedMeans function and minReplicates = 3).
However, even ignoring these, I am finding some strange results. For example, the CXCR4 gene has p-value 0.898 (padj 0.965), but as you can see from the attached boxplot (which plots log-transformed DESeq-scaled values), it appears by eye to be significantly different. There is, however, one major outlier in the in vivo HCN population (both noticed visually and by Cook's distance).
Is there a way to try and account for this in my small sample number dataset without going manually through genes of interest?