I am having trouble working out which values I should use to decide which genes count as “significantly differentially expressed”.
I have been asked to attempt this using two methods.
Method 1 is the pipeline already used in our lab: mapping with HISAT2 -> normalizing with cuffnorm -> getting FPKM values -> computing the log2FC from those and performing a t-test for P-values -> adjusting the P-values with the BH method -> keeping only genes with P-value < .05 and log2FC > 2
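To make the last two steps of Method 1 concrete, here is a minimal Python sketch of how I understand the BH adjustment and the filtering. It is not our lab's actual script. The function names `bh_adjust` and `select_genes` are made up for illustration, and I am assuming the cutoff is applied to the adjusted P-value and to the absolute value of the log2FC.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(genes, log2fc, pvalues, alpha=0.05, lfc_cutoff=2.0):
    """Keep genes with adjusted p < alpha and |log2FC| > lfc_cutoff.

    Assumption: the filter uses the BH-adjusted p-value and the
    absolute log2FC, so down-regulated genes are kept too.
    """
    padj = bh_adjust(pvalues)
    return [
        g for g, lfc, q in zip(genes, log2fc, padj)
        if q < alpha and abs(lfc) > lfc_cutoff
    ]
```

For example, `select_genes(["a", "b", "c"], [3.0, -2.5, 1.0], [0.01, 0.02, 0.03])` would drop gene `c` because its |log2FC| is below the cutoff, even though its adjusted P-value passes.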
I feel like requiring a change greater than 2 is arbitrary, because it doesn’t take into account that even a subtle change in gene expression can have a large effect on biological function. Unless log2FC just means the values have been scaled down so that a noticeable difference has to be greater than |2|; in that case, fine, I would make sure they are > |2|.
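To see what the cutoff means on the original scale, here is a small Python sketch of the arithmetic only. The function names and the pseudocount of 1 (added to avoid dividing by zero) are my own assumptions, not something taken from our pipeline.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 of the treated/control ratio.

    The pseudocount is an assumption added to avoid division by zero.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def fold_change_from_log2(lfc):
    """Convert a log2 fold change back to a plain ratio."""
    return 2.0 ** lfc


print(fold_change_from_log2(1.0))   # 2.0: a log2FC of 1 is a doubling
print(fold_change_from_log2(2.0))   # 4.0: a log2FC of 2 is a 4-fold change
print(fold_change_from_log2(-2.0))  # 0.25: a log2FC of -2 is a 4-fold decrease
```

So if I am reading the arithmetic right, a cutoff of |log2FC| > 2 keeps only genes whose expression changes more than 4-fold in either direction.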
The other method I was told to use is edgeR. I input raw read counts into edgeR -> calcNormFactors -> estimate dispersion -> filter out lowly expressed genes -> glmFit -> glmQLFTest -> and use topTags to see which genes are DE, which I noticed are sorted by FDR value. At the same time, I noticed that edgeR reports logFC rather than log2FC. Which one am I supposed to use? Wouldn’t the resulting P-values and FDR values change depending on whether they were calculated using log2FC or logFC?
I don’t understand why my lab would use log2FC from FPKM values to calculate P-values and then adjust them, or why that is better or worse than edgeR, which uses logFC to calculate P-values and FDR.
My question, I guess, is which of the two methods is better. Should I calculate the log2FC from edgeR’s logFC, perform a t-test, and adjust the P-values myself? Will that give me the Q-value, the adjusted P-value, or the FDR value, or are all three the same thing?
I can’t find a clear answer anywhere. I have read so much documentation that my brain hurts, and I feel like I’m asking my coworkers the same questions over and over again, only to realize they don’t really understand it themselves. Please do not just refer me to links, because I have gone through many and been left with the same questions. If you don’t want to provide actual input, then please don’t bother commenting unless the link truly explains it perfectly, but I would much rather someone attempt to explain it in layman's terms. Thank you in advance.