Hi there!
I'm interested in DEG analysis in RNA-seq and I have a question about statistical analysis methods. I tried to analyze DEGs using three different methods as follows:
1. DESeq2 package
2. FPKM t-test in Excel: =T.TEST(ctr1:ctr3,trt1:trt3,2,2)
3. log2(FPKM) t-test in Excel: =T.TEST(ctr1:ctr3,trt1:trt3,2,2)
Actually, I'm a bit confused between 2 and 3. I feel like I should do 3 so that the data are closer to a normal distribution. Anyway, what I'm most curious about is whether the results of 1 should be broadly consistent with those of 2 or 3, even if they don't match perfectly.
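For reference, here is a minimal Python sketch of what the Excel call in 2 and 3 computes: `T.TEST(..., 2, 2)` is a two-tailed, equal-variance (Student) t-test. The FPKM values are made up for illustration, the function names are my own, and the closed-form p-value is only valid for 3 vs 3 replicates (4 degrees of freedom).

```python
import math
from statistics import mean, variance

def t_cdf_df4(t):
    """CDF of Student's t distribution with exactly 4 degrees of freedom (closed form)."""
    u = t * t / (1.0 + t * t / 4.0)
    return 0.5 + 0.375 * (t / math.sqrt(1.0 + t * t / 4.0)) * (1.0 - u / 12.0)

def student_t_test_3v3(a, b):
    """Two-tailed, equal-variance t-test for two groups of 3 replicates each."""
    assert len(a) == 3 and len(b) == 3, "closed-form p-value assumes df = 4"
    pooled_var = (variance(a) + variance(b)) / 2.0  # valid because group sizes are equal
    se = math.sqrt(pooled_var * (1.0 / 3 + 1.0 / 3))
    t = (mean(a) - mean(b)) / se
    p = 2.0 * (1.0 - t_cdf_df4(abs(t)))
    return t, p

# Hypothetical FPKM values for a single gene (not from any real dataset)
ctr = [10.0, 12.0, 11.0]
trt = [20.0, 45.0, 30.0]

# Method 2: t-test on FPKM
t_raw, p_raw = student_t_test_3v3(ctr, trt)
# Method 3: t-test on log2(FPKM)
t_log, p_log = student_t_test_3v3([math.log2(x) for x in ctr],
                                  [math.log2(x) for x in trt])

print(f"FPKM:       t = {t_raw:.2f}, p = {p_raw:.4f}")
print(f"log2(FPKM): t = {t_log:.2f}, p = {p_log:.4f}")
```

The same gene gets a different p-value in 2 and 3 because the log transform changes the spread within each group, so the two gene lists will not be identical either.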
For example, in my analysis, the results were as follows:
1. up&down=1500
2. up=200, down=650
When I tried different datasets, there were cases where the t-test gave more up-regulated than down-regulated genes, while the numbers of up- and down-regulated genes were similar in the DESeq2 results.
My beloved professor is excited to torment me even more with this result. My professor has two main arguments:
1. Generally, t-tests are less strict in terms of test power than parametric tests, so the number of DEGs in t-tests is expected to be higher than in DESeq2. So why are there generally more genes identified by DESeq2?
2. The result of the t-test performed on FPKM data should be similar to DESeq2 results, but the difference is too great.
Now, I need to prepare evidence to refute my professor's arguments or else I need to think more about my analysis. Any opinions are welcome.
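For 1) t-tests are parametric; they are based on the normal distribution. With few replicates they are expected to yield far fewer DEGs simply because they are massively underpowered: with 3 vs 3 samples the per-gene variance is estimated from only 4 degrees of freedom, so few genes survive multiple-testing correction. Please read the literature to learn why. This is the very reason methods such as DESeq2 exist: they share information across genes to stabilize the dispersion estimates.
RNA-seq count data are not normally distributed; that is well known.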
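To make the power argument concrete, here is a rough sketch (Python, standard library only) of how large a test statistic must be to pass a Bonferroni-style threshold when the variance is estimated per gene from 3 vs 3 replicates (4 degrees of freedom), compared with a normal reference where the variance is treated as known. The figure of 20,000 genes is an assumption for illustration, and this is a simplified comparison: DESeq2 itself uses shrunken dispersion estimates and, by default, a Benjamini-Hochberg adjustment.

```python
import math
from statistics import NormalDist

def t_cdf_df4(t):
    """CDF of Student's t distribution with exactly 4 degrees of freedom (closed form)."""
    u = t * t / (1.0 + t * t / 4.0)
    return 0.5 + 0.375 * (t / math.sqrt(1.0 + t * t / 4.0)) * (1.0 - u / 12.0)

def two_sided_critical_t_df4(alpha):
    """Approximate smallest |t| whose two-sided p-value is <= alpha (bisection)."""
    lo, hi = 0.0, 1e4
    for _ in range(200):
        mid = (lo + hi) / 2.0
        p = 2.0 * (1.0 - t_cdf_df4(mid))
        if p > alpha:
            lo = mid   # not significant yet, need a larger statistic
        else:
            hi = mid   # significant, try a smaller statistic
    return hi

n_genes = 20_000         # assumed number of tested genes (hypothetical)
alpha = 0.05 / n_genes   # Bonferroni-style per-gene threshold

t_crit = two_sided_critical_t_df4(alpha)
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)

print(f"|t| needed with 4 df:          {t_crit:.1f}")
print(f"|z| needed with known variance: {z_crit:.1f}")
```

The threshold under the 4-df t distribution is several times larger than the normal one, which is why a plain per-gene t-test on 3 vs 3 samples reports few genes once p-values are corrected for multiple testing.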
For 2) No, there is no basis for such a statement. Again, this has been discussed extensively in the biostats/RNA-seq literature over the last two decades. With all due respect, your advisor has apparently not read much of it: it is well accepted today that analysis, especially with low sample sizes, needs specialized methods that moderate (low) counts.
The question of how standard tests (t-test, Wilcoxon, and others) compare to specialized methods has been asked many times before, both here and on platforms such as StackExchange and biostars.org. Please google it and refer to benchmarking papers. The same goes for the pros and cons of testing on normalized values such as FPKM.
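As a small illustration of one commonly cited drawback of testing on FPKM, the sketch below (Python, made-up numbers) shows two genes with identical FPKM but very different raw counts. The gene names, lengths, counts, and library size are all hypothetical, and the noise figure covers Poisson sampling noise only, not biological variability.

```python
import math

def fpkm(count, gene_length_bp, total_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return count * 1e9 / (gene_length_bp * total_fragments)

total_fragments = 20_000_000  # assumed library size (hypothetical)

# gene name -> (raw fragment count, gene length in bp); hypothetical values
genes = {
    "short_gene": (10, 500),
    "long_gene": (1000, 50_000),
}

results = {}
for name, (count, length) in genes.items():
    value = fpkm(count, length, total_fragments)
    # Relative sampling noise of a Poisson count is 1 / sqrt(count)
    rel_noise = 1.0 / math.sqrt(count)
    results[name] = (value, rel_noise)
    print(f"{name}: count = {count}, FPKM = {value:.2f}, "
          f"relative sampling noise = {rel_noise:.3f}")
```

Both genes end up with the same FPKM, yet the estimate for the short gene rests on far fewer fragments. A test that only sees FPKM cannot tell the two apart, whereas count-based models such as DESeq2 use the raw counts to account for this.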
My recommendation would be not to overthink alternative strategies and simply use what everyone uses for RNA-seq: DESeq2 with raw counts, or alternatives such as edgeR or limma-voom.