Hi everyone, first time poster.
I'm posting because I can't find substantive answers to my question (the ones I did find don't quite fit my situation), and I can't find much about it in the literature either. We have a study with two aims:
- Which genes are differentially expressed in association with a trait (gene discovery)?
- From a given list of ~20 genes (a priori/candidate genes that we think will be involved), which are actually associated?
To answer these, we've done bulk RNA-seq in the hope that it can address both aims, but my concern is about aim 2 (the candidate genes). My plan is to simply use DESeq2 to identify differentially expressed genes, then extract the results for our 20 genes and discuss them as possibly involved based on the uncorrected p-value.
However, I am not sure if this is the best approach. For example, should I be taking the normalized or VST values for all 20 genes and running something like an ANOVA or linear regression?
If just subsetting from the DESeq2 results, what is the best way to correct for multiple testing among the smaller set of 20 genes? Should I only accept genes that pass a Bonferroni correction (e.g. with 20 hypotheses, only genes with a raw p-value below 0.05/20 = 0.0025)?
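To make the question concrete, here is a minimal sketch of the two corrections I am weighing, applied only to the candidate subset. It is plain Python with no dependencies. The gene names and p-values are made up for illustration (5 genes instead of 20 to keep it short), the function names are my own, and it assumes the raw p-values have already been pulled out of the DESeq2 results table.

```python
# Hypothetical raw (uncorrected) p-values for the candidate genes,
# as extracted from the DESeq2 results. Names and values are invented.
candidate_pvalues = {
    "GENE_A": 0.0004,
    "GENE_B": 0.0030,
    "GENE_C": 0.0200,
    "GENE_D": 0.0450,
    "GENE_E": 0.6000,
}


def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return {gene: min(1.0, p * m) for gene, p in pvals.items()}


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg (FDR) adjusted p-values."""
    m = len(pvals)
    # Walk from the largest p-value to the smallest so a running minimum
    # keeps the adjusted values monotone.
    ranked = sorted(pvals.items(), key=lambda kv: kv[1], reverse=True)
    adjusted = {}
    running_min = 1.0
    for i, (gene, p) in enumerate(ranked):
        rank = m - i  # 1-based rank in ascending order
        running_min = min(running_min, p * m / rank)
        adjusted[gene] = running_min
    return adjusted


bonf = bonferroni(candidate_pvalues)
bh = benjamini_hochberg(candidate_pvalues)

alpha = 0.05  # assumed significance level
for gene, p in candidate_pvalues.items():
    print(gene, p, round(bonf[gene], 4), round(bh[gene], 4),
          "Bonferroni pass" if bonf[gene] < alpha else "",
          "BH pass" if bh[gene] < alpha else "")
```

Comparing an adjusted p-value to 0.05 is the same as comparing the raw p-value to 0.05/m under Bonferroni, which is where the 0.0025 figure for 20 genes comes from. Benjamini-Hochberg is less strict, so part of my question is which of the two is appropriate for a small a priori list.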
Any other thoughts about alternative approaches or specific pitfalls of my planned approach are also appreciated.