Dear community,
I have a question regarding the p-value adjustments in multiple testing in RNA-seq. I exposed fish from the same species but from different regions (3 sites per region) to nitrite in the lab and want to see different responses between Regions to the exposure to nitrite in the gene expression of liver with RNA-seq. I used 5 - 7 replicates per condition (control - exposed) and Site (110 samples in total), and I think that this experimental design fits the statistical methodology of the 'block' and 'correlation' parameters in the lmFit function in limma.
I filtered low count genes ('filterByExpr'), normalised them ('calcNormFactors' with method "TMM"), and I obtain with lmFit a few p-values (420/13000) lower than 0.05 in a given factor, but when applying the p-value adjustment for multiple testing of B-H all p-values become > 0.99. I am aware that I am probably going short of statistical power and/or the physiological effects are mild because the exposure was to environmentally relevant concentrations of nitrite instead of extreme exposures, but with other biomarkers I did see significant changes in respirometry rates, egg sizes, gene expression in gills in a few genes with RT-qPCR, and blood parameters. Therefore, I think the penalisation of using so many multiple testing in RNA-seq is driving my results. Is there an alternative approach in these cases (e.g. pathway analysis as a whole)? Thank you for your consideration.
```
Thank you Gordon for your thorough reply. It seems when MDS-plotting the obtained object from 'voom', that the points do not cluster together according to the 'treatment' group. When turning on sample weights with voomLmFit the percentage of uncorrected p-values <0.05 improves a little bit, it raises to 5.2% for Region 1 and 7% for Region 2 and 5% for Region 3. The 3.3% value from my previous message was from Region 2. Anyway when applying the B-H procedure all become > 0.05. Therefore, I guess that any sublethal effect in the exposed fish is not being captured by my experimental design.....
plotMDS() is best run on the original DGEList object, i.e., on the input to voom rather than on the output. See the example analyses in the limma documentation and associated workflows.