Differential expression analysis of a specific list of genes
JoannaF ▴ 10
@joannaf-9881
France

Hello,

I analysed RNA-seq data (432 samples) and obtained the differential expression statistics with DESeq2, using results(DESeq(dds)), where dds was created with DESeqDataSetFromHTSeqCount.

We are interested in a list of about 60 specific genes among the 19,000 protein-coding genes studied.

How can I run a differential expression analysis restricted to the 60 genes of interest? Do I have to adjust the p-values?

DESeq2 • 194 views
swbarnes2 ▴ 800
@swbarnes2-14086
San Diego

Just do things the right way: process all the genes, then subset the results to the 60 genes you care about.
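As a rough illustration of "process everything, subset at the end", here is a minimal base R sketch. The data frame `res` is a simulated stand-in for `as.data.frame(results(dds))`, and the gene IDs and `genes_of_interest` are hypothetical. The real `results()` also applies independent filtering by default, which this toy does not reproduce.

```r
# Simulated stand-in for as.data.frame(results(dds)); not real data.
set.seed(1)
n_genes <- 19000
res <- data.frame(
  log2FoldChange = rnorm(n_genes),
  pvalue         = runif(n_genes),
  row.names      = paste0("gene", seq_len(n_genes))
)

# Adjust over ALL tested genes (Benjamini-Hochberg, the default
# adjustment method used by results()).
res$padj <- p.adjust(res$pvalue, method = "BH")

# Hypothetical list of 60 genes of interest.
genes_of_interest <- paste0("gene", seq(100, 6000, by = 100))

# Keep only IDs that are present in the results, then subset at the very end.
present    <- intersect(genes_of_interest, rownames(res))
res_subset <- res[present, ]

# The padj column is untouched: it still reflects the genome-wide adjustment.
res_subset <- res_subset[order(res_subset$padj), ]
head(res_subset)
```

With a real DESeqResults object the same row subsetting, `res[present, ]`, should work directly on the object returned by `results(dds)`.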


Thanks for your answer! So I don't have to adjust the p-values for the ~60 genes? I have seen in this older post, "DEseq2 with limited gene set", that it is advisable to correct the p-values, but is this always the case?


The post seems to agree with what I said: process all the genes, subset at the very end. I would not alter the adjusted p-value calculation; recomputing it over only the 60 genes will make your adjusted p-values look better than they should.


Whether to readjust the p-values after subsetting to the 60 genes depends, I think, on whether these genes were known a priori or were found through the analysis of this data.

@JoannaF: If you assembled all these data together only in order to analyze these 60 genes, then I think it is OK to readjust the p-values after you pull out the results. If you just discovered this 60-gene subset here, then no.
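To make the difference concrete, here is a small base R sketch comparing the genome-wide adjustment with a readjustment over an a priori gene list. Everything in it is toy data: the evenly spaced p-values stand in for genes with no effect, and the gene IDs, the three "signal" genes, and `genes_of_interest` are hypothetical.

```r
n_genes <- 19000

# Evenly spaced p-values as a stand-in for null genes (toy data).
pvalue <- seq_len(n_genes) / n_genes
names(pvalue) <- paste0("gene", seq_len(n_genes))

# Give three genes a strong signal.
pvalue[c("gene300", "gene600", "gene900")] <- c(1e-6, 1e-4, 1.5e-3)

# Hypothetical a priori list of 60 genes.
genes_of_interest <- paste0("gene", seq(300, 18000, by = 300))

# Option A: adjust over all genes, then subset.
padj_all <- p.adjust(pvalue, method = "BH")[genes_of_interest]

# Option B: subset the raw p-values, then adjust over the 60 genes only.
padj_subset <- p.adjust(pvalue[genes_of_interest], method = "BH")

comparison <- data.frame(
  pvalue      = pvalue[genes_of_interest],
  padj_all    = padj_all,
  padj_subset = padj_subset
)
head(comparison[order(comparison$pvalue), ])
```

In this toy example the readjusted values are smaller because the correction is spread over 60 tests instead of 19,000, which is only defensible if the list was fixed before looking at the data.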

If you are using these 60 genes as a representation of a biological pathway and want to assess the statistical significance of its activity across your comparison, then you should instead use one of the "standard" modes of gene set enrichment / over-representation analysis.


These 60 genes were known a priori (before the differential gene expression analysis), but they are not the only goal of this analysis.

We have already done Gene Set Enrichment Analysis. What did you mean by "over-representation analysis"?

Thanks again!


The type of analysis that is done when you use goseq, for instance, falls under the class of analyses I'm calling "over-representation analysis".
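For what it's worth, the core idea of an over-representation test can be sketched in base R as a hypergeometric (one-sided Fisher) test. This is a simplified stand-in, not what goseq does internally: goseq additionally corrects for gene length bias. All counts below are made-up toy numbers.

```r
# Toy counts (assumptions, not results from any real data set).
n_universe <- 19000  # all genes tested
n_de       <- 1500   # genes called differentially expressed
n_set      <- 60     # genes in the set of interest
n_overlap  <- 12     # genes in the set that are also DE

# P(overlap >= n_overlap) if DE genes were drawn at random from the universe.
p_hyper <- phyper(n_overlap - 1,
                  m = n_set,
                  n = n_universe - n_set,
                  k = n_de,
                  lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on a 2x2 table
# (rows: DE yes/no, columns: in set yes/no).
tab <- matrix(
  c(n_overlap,        n_set - n_overlap,
    n_de - n_overlap, n_universe - n_set - n_de + n_overlap),
  nrow = 2,
  dimnames = list(DE = c("yes", "no"), in_set = c("yes", "no"))
)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

c(hypergeometric = p_hyper, fisher = p_fisher)
```

The question this answers is whether the set contains more DE genes than expected by chance, which is different from asking which individual genes in the set are DE.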