Question

ANOVA-like testing for bulkRNAseq with three conditions with Wald test

thomas.heigl.ibk • 0

@adde7aac

Switzerland

I am new to bioinformatics and am trying to learn as much as possible in a short time so that I can handle the data analysis in our labs.

I started by reading up on RNA-seq, since we have bulk RNA-seq data from sperm to analyse across three conditions: wildtype (WT), heterozygous (HZ) and mutant (MT), each with three biological replicates. For two conditions, such as WT vs MT, the analysis with DESeq2 is straightforward. It gets puzzling with three conditions, because I don't know whether the Wald test considers all three conditions in the condition column of the design file. When I look at the factor levels, I see three (WT, HZ and MT), with WT in first position because I releveled the factor that way. Nevertheless, I am not sure what exactly is happening under the hood. When I call resultsNames() I get: "Intercept" "condition_hz_vs_wt" "condition_rd_vs_wt"

So, what happened when I called DESeq()? Did the Wald test include all nine samples in its calculations, or did it only compute the HZ vs WT and MT vs WT comparisons separately?
In theory, if I split the input counts into WT-MT and WT-HZ pairs and run the whole procedure separately for each pair (two different design files, separate sessions), I should get the same results (L2FC, p-value, adjusted p-value, etc.) as with the combined approach described above, am I right?
If the previous point is true, I would run three separate analyses to compare WT-HZ, WT-MT and HZ-MT, since an ANOVA-like approach seems strange here and I don't understand how three conditions could be used in such an approach.
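
For reference, the setup described above (one condition column covering all nine samples, with WT as the reference level) can be sketched in base R. The sample names and the upper-case level names WT, HZ and MT are assumptions made for illustration; they are not taken from the actual design file.

```r
# Hypothetical sample sheet: three replicates per genotype (names are assumptions).
coldata <- data.frame(
  sample    = paste0(rep(c("WT", "HZ", "MT"), each = 3), "_", 1:3),
  condition = factor(rep(c("WT", "HZ", "MT"), each = 3))
)

# Factor levels default to alphabetical order (HZ, MT, WT),
# so WT has to be set as the reference level explicitly.
coldata$condition <- relevel(coldata$condition, ref = "WT")
levels(coldata$condition)   # "WT" "HZ" "MT"

# A design of ~ condition yields ONE design matrix for all nine samples:
# an intercept (the WT baseline) plus one column per non-reference level.
design <- model.matrix(~ condition, data = coldata)
colnames(design)            # "(Intercept)" "conditionHZ" "conditionMT"
dim(design)                 # 9 rows (samples) x 3 columns (coefficients)
```

The three columns correspond to the three entries reported by resultsNames(): the intercept and one coefficient for each non-reference level compared against the reference.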

Thanks in advance. Trying to get a good grasp on these things, however, it seems that sometimes the descriptions online confuse me even more.

Cheers

DESeq2 • 463 views

ADD COMMENT • link updated 11 months ago by ATpoint ★ 4.0k • written 12 months ago by thomas.heigl.ibk • 0

Yes, all samples are considered. You can use contrasts to get the pairwise results you want. Analysing two groups at a time will likely give similar but not identical results, because normalization and the estimation of model parameters will differ slightly; the vignette has a section about that. You can compare all three levels to each other in a single analysis; see the vignette sections on contrasts and on when (or when not) to split groups.
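
A minimal sketch of what this could look like, assuming simulated counts from DESeq2's makeExampleDESeqDataSet() and the level names WT, HZ and MT (both are stand-ins for illustration, not the original data). It fits one model on all nine samples, extracts the three pairwise Wald tests with contrasts, and adds a likelihood ratio test as one way to get an ANOVA-like "any difference between conditions" result.

```r
library(DESeq2)

set.seed(1)
# Simulated example data (assumption): 1000 genes, 9 samples.
dds <- makeExampleDESeqDataSet(n = 1000, m = 9)

# Overwrite the simulated grouping with three genotypes, WT as reference.
dds$condition <- factor(rep(c("WT", "HZ", "MT"), each = 3),
                        levels = c("WT", "HZ", "MT"))
design(dds) <- ~ condition

# Size factors, dispersions and coefficients are estimated from all nine samples.
dds <- DESeq(dds)
resultsNames(dds)  # "Intercept" "condition_HZ_vs_WT" "condition_MT_vs_WT"

# Pairwise Wald tests via contrasts: c(factor, numerator, denominator).
res_hz_wt <- results(dds, contrast = c("condition", "HZ", "WT"))
res_mt_wt <- results(dds, contrast = c("condition", "MT", "WT"))
# MT vs HZ has no coefficient of its own, but it is still available as a contrast.
res_mt_hz <- results(dds, contrast = c("condition", "MT", "HZ"))

# ANOVA-like alternative: a likelihood ratio test of the full model
# (~ condition) against an intercept-only model (~ 1).
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
res_lrt <- results(dds_lrt)  # p-values test for any difference across the three levels
```

In the LRT results the p-values refer to the whole condition factor, while the reported log2 fold change still belongs to a single coefficient, so the contrasts above remain the way to get pairwise effect sizes.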

ADD REPLY • link 11 months ago ATpoint ★ 4.0k

score 0 · Answer 1 · 2023-05-01

Michael Love 41k

@mikelove

United States

For questions about setting up the statistical design for your study, I recommend speaking with a local statistician or someone familiar with linear models in R. I have to restrict my time on the support site to software-related questions.

ADD COMMENT • link 11 months ago Michael Love 41k