I am using DESeq2 to test for differential expression of miRNA reads. The 10 control samples have about 5 to 60 million reads each. The 5 treatment samples have about 1 to 5 million reads each. DESeq2's normalization adjusts for this by estimating size factors that scale the counts. However, the overabundance of miRNAs found to be down-regulated (LFC < 0, padj < 0.05) suggests there simply are not enough reads in the treatment samples to detect the rarer miRNAs.

To address this problem I have done two things: (1) I restricted the control samples to those with 10 million reads or fewer, leaving 7 control samples, but the ratio of the median read depths for the two groups is still about 4 to 1; and (2) I am filtering out the miRNAs that are not adequately expressed in the treatment samples, although this could preclude finding miRNAs that were severely down-regulated by treatment.

Another alternative is to subsample the controls so that their read counts are comparable to those of the treatments, but I have seen posts indicating this is not a good approach (I am not sure I understand why). Can anyone suggest a better approach? It may well be that the rarer miRNAs are not that important, but I like to be thorough.
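
To make the size factor point concrete, here is a minimal sketch of the median-of-ratios calculation as I understand it. It is plain Python rather than DESeq2's actual R code, and the function name `size_factors` and the toy counts are made up for illustration (they are not my data). The last row shows the concern above: a rare miRNA that drops to zero in a shallow sample stays at zero after scaling.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per miRNA), each row a list of per-sample counts.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    # Only features with a nonzero count in every sample have a usable
    # geometric mean, so the rest are left out of the estimate.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("every feature has a zero in at least one sample")
    # Log of the per-feature geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Ratio of each count to its feature's geometric mean, on the log scale.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy counts: 4 miRNAs x 3 samples.
# Sample 2 is twice as deep as sample 1; sample 3 is ten times shallower.
toy = [
    [1000, 2000, 100],
    [500, 1000, 50],
    [200, 400, 20],
    [10, 20, 0],  # rare miRNA: no reads at all in the shallow sample
]

sf = size_factors(toy)
# Divide each count by its sample's size factor.
normalized = [[c / s for c, s in zip(row, sf)] for row in toy]

print("size factors:", sf)
print("normalized rare miRNA:", normalized[3])
```

In this toy case the abundant miRNAs line up across samples after scaling, but the rare one still reads as zero in the shallow sample, which would look like strong down-regulation even though nothing changed.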
Thanks, Bill