Hi, I am doing a differential expression analysis of small RNA data using edgeR. I have 4 normal and 4 diseased samples, and the samples are paired. I have very little knowledge of statistics, so I would appreciate clarification on the following:
When I run the analysis without any pre-filtering of low-abundance genes, I get about 10-12 genes with FDR < 0.05. When I impose a filtering criterion, for example requiring a CPM value greater than 1 in at least 4 of the samples, I get only 1 gene with FDR < 0.05.
However, the top genes are more or less the same in both cases. In the second case, the results look better when I view the CPM values of the samples side by side, because most of the low-abundance genes have been filtered out.
So my question is: does reducing the data set by filtering cause the FDR values to increase? Does the sample size have a bearing on the p-values and FDR values?
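To make the filtering rule I described concrete, here is a minimal sketch in plain Python. It is not my actual edgeR code: the function names `cpm` and `keep_expressed` and the counts are made up for illustration, and it assumes CPM is computed from the raw column totals with no normalization factors applied.

```python
def cpm(counts):
    """Convert a genes x samples table of raw counts to counts per million.

    Assumption: library size is the plain column sum (no normalization factors).
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)] for row in counts]


def keep_expressed(counts, min_cpm=1.0, min_samples=4):
    """Flag each gene: True if its CPM exceeds min_cpm in at least min_samples samples."""
    return [sum(v > min_cpm for v in row) >= min_samples for row in cpm(counts)]


# Made-up counts for 8 samples; every library sums to exactly 1,000,000 reads,
# so each CPM value equals the raw count.
low = [0, 1, 0, 0, 1, 0, 0, 1]    # never above 1 CPM
mid = [5, 6, 0, 0, 7, 8, 0, 0]    # above 1 CPM in exactly 4 samples
high = [1_000_000 - a - b for a, b in zip(low, mid)]
counts = [high, low, mid]

keep = keep_expressed(counts)
print(keep)  # [True, False, True]
```

Only the genes flagged `True` would go into the test, so the number of genes tested is smaller after filtering than before.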