Hello,
I have a simple RNA-seq experiment with treatment and control, each with 3 biological replicates. I ran my data through edgeR and obtained differentially expressed genes (DEGs). Because of the low sample number and small effect size, the treatment likely affects more genes than those that met the cutoff. For that reason, I want to try extracting more information from my data using a permutation test. I resampled my data by shuffling the columns and generated 1000 permuted dataframes. Next, I ran each dataframe through my edgeR pipeline, which produced results such as RowName, logFC, logCPM, LR, and PValue. My question is: which value (for example, logFC or PValue) do I take from each permuted dataframe to generate the distribution for each gene and calculate the p-value? Also, how is the p-value calculated?
Thank you!
James explains the point clearly (the most concise example I have ever read). IMHO, a good reference for this kind of combinatorial analysis is Tusher et al., 2001, "Significance analysis of microarrays applied to the ionizing radiation response".
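
To make the combinatorial side concrete, here is a minimal Python sketch of an exact permutation test for a single gene in a 3 vs 3 design. It is only an illustration under stated assumptions: it is not edgeR and not the SAM procedure from Tusher et al., the statistic is a plain difference in group means, and the function name `permutation_pvalue` and the expression values are made up. The p-value is the fraction of relabelings whose statistic is at least as extreme as the observed one.

```python
from itertools import combinations
from math import comb

def permutation_pvalue(values, n_treated):
    """Exact two-sided permutation p-value for a difference in group means.

    `values` lists the treated samples first, then the controls.
    (Hypothetical helper for illustration only; not part of edgeR.)
    """
    n = len(values)
    total = sum(values)

    def mean_diff(treated_idx):
        t = sum(values[i] for i in treated_idx)
        return t / n_treated - (total - t) / (n - n_treated)

    # Statistic for the real labeling (first n_treated samples are treated).
    observed = abs(mean_diff(range(n_treated)))

    # Statistic for every distinct way of choosing which samples are "treated".
    stats = [abs(mean_diff(idx)) for idx in combinations(range(n), n_treated)]

    # Count relabelings at least as extreme as the observed one.
    # The real labeling is among them; the small tolerance absorbs float rounding.
    extreme = sum(s >= observed - 1e-12 for s in stats)
    return extreme / len(stats)

# With 3 treated and 3 control samples there are only C(6, 3) distinct relabelings.
n_relabelings = comb(6, 3)

# Made-up log-expression values for one gene: 3 treated, then 3 control.
gene = [8.1, 8.4, 8.3, 7.2, 7.0, 7.5]
p = permutation_pvalue(gene, 3)

print(n_relabelings)  # 20
print(p)              # 0.1
```

Even for this gene, where every treated value is higher than every control value, the two-sided p-value cannot go below 2/20 = 0.1, because the real labeling and its mirror image are both counted. Shuffling the columns 1000 times therefore mostly revisits the same 20 relabelings and does not add resolution.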