Question

Detection of reference genes by Deseq2

0

Entering edit mode

sj ▴ 10

@7dba58e8

Last seen 20 months ago

Germany

Hello, I´m interested in finding reference genes for qPCR in my RNAseq data. I use therefore the following parameters: altHypothesis="lessAbs",lfcthreshold=0.5, alpha=0.05 to detect the most stable genes: I am not sure if I have understood this correctly: So the alternative and null hypothesis are swapped, i.e. if pval<0.05, the null hypothesis (=difference) is rejected and the alternative hypothesis (difference smaller than 0.5) is accepted.

1) Do I understand it correctly that the genes with the lowest p-value are the most stable genes when I extract my results? 2) When I ask for the summary(res), it says:

out of 67793 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.50 (up)    : 0, 0%
LFC < -0.50 (down) : 621, 0.92%
outliers [1]       : 3331, 4.9%
low counts [2]     : 61080, 90%

I´m wondering, why the summary shows the number of genes greater than my threshold, when I´m more interested in less.

3) In the Beginners guide it says: lessAbs - |β|<x - p values are the maximum of the upper and lower tests<--- what does this mean?

I`m thankful for any suggestions!

lessAbs reference_genes DESeq2 • 977 views

ADD COMMENT • link updated 20 months ago by Michael Love 41k • written 20 months ago by sj ▴ 10

score 0 · Answer 1 · 2022-08-05

0

Entering edit mode

Michael Love 41k

@mikelove

Last seen 1 day ago

United States

hi,

Looking at the code, I think summary.DESeqResults doesn't take into account the direction of the test. So you can just get these tables by hand:

with(res, table(sign(log2FoldChange), padj < Y))

The p-values we construct are described in more detail in the 2014 paper if you want to look there. We construct a test that is conservative, but gives error control for the null hypothesis.

Finally, regarding whether these are biologically stable, unfortunately that depends on the normalization (scaling factors). It's possible that despite the way we set up the null hypothesis, if the scaling factors are wrong (due to global shifts in expression), then you wouldn't be capturing the stable features. With global shifts in expression, you need some additional information either from the experiment or from prior knowledge of the system.

ADD COMMENT • link 20 months ago Michael Love 41k

0

Entering edit mode

Hi Michael,

Thanks for your quick reply. What do you mean with "it depends on the normalization",I mean I am using the Deseq2 script which includes normalization for sequencing depth and RNA composition. I used the design~group with the following 6 groups:

1) AA_T0_warm 2) AA_T0_cold 3) SS_T0_warm 4) SS_T0_cold 5) SS_T7_warm 6) SS_T7_cold

I performed DGE analysis already, and detected genes of interest which are differentially expressed. For qPCR validation, I will select some of these genes of interest, but I need to conduct a normalization by using reference genes (most stable genes), and I was thinking to obtain them for every comparison (in total 15) by the following code: e.g. AA_T0_warm VS AA_T0_cold

AA_T0_warm_VS_AA_T0_cold<-results(dds_all,contrast=c("group"," AA_T0_warm","AA_T0_cold"),lfcThreshold = 0.5,altHypothesis = "lessAbs",alpha=0.05)

I would assume that the genes with the lowest log2folchange are the most stable, and that I can use the lists (=15) from all comparisons to find universal reference genes that are stable across all conditions(showing lowest log2foldchange).

Do you think, using altHypothesis = "lessAbs" is made for finding most stable genes?

ADD REPLY • link 20 months ago sj ▴ 10

1

Entering edit mode

To clarify my last paragraph above -- despite our best efforts/methods, it is possible that we cannot determine from the counts alone what is "stable" and what is changing, without any additional evidence (either spike-in sequences or biological prior knowledge). So the question is not, what options/arguments to use, but more, is there a global shift in expression in my cells and what would I look for to define stable RNA levels if that were the case.

ADD REPLY • link 20 months ago Michael Love 41k

0

Entering edit mode

Long story short, use plotMA() to see how normalization went. If the plot is somewhat symmetric and/or the majority of point center along y=0 which is typically the case then probably your lessAbs genes could be used for your purpose.