Question

different results in DESeq2 when comparing groups in two ways

zohreh.fazelan • 0

@07356868

Last seen 2.9 years ago

Canada

I have different treatments (infected and control). When using DESeq2, I get different results depending on which of two ways I set up the comparisons.

First way: one factor, treatment, with four levels:

factor level 1: moribund infected

factor level 2: 5-week infected

factor level 3: moribund control

factor level 4: 5-week control

I have 6 result files (comparisons among factor levels).

Second way (I compared just 2 groups to each other at a time), each with factor: treatment:

factor level 1: moribund infected, factor level 2: 5-week infected

factor level 1: moribund infected, factor level 2: moribund control

factor level 1: moribund infected, factor level 2: 5-week control

factor level 1: 5-week infected, factor level 2: moribund control

factor level 1: 5-week infected, factor level 2: 5-week control

factor level 1: moribund control, factor level 2: 5-week control
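For reference, the six pairwise comparisons follow directly from the four factor levels. A minimal Python sketch that enumerates them (the level labels are copied from this post; the variable names are just for illustration):

```python
from itertools import combinations

# The four levels of the single "treatment" factor described in this post.
levels = [
    "moribund infected",
    "5-week infected",
    "moribund control",
    "5-week control",
]

# Every unordered pair of levels: 4 choose 2 = 6 comparisons.
pairs = list(combinations(levels, 2))

for first, second in pairs:
    print(f"{first} vs {second}")
```

The same six pairs are compared in both ways; the difference is only whether each pair is analysed with the other two groups present in the model or not.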

In the end, the results differed between these two ways of comparison. Please guide me.

DESeq2 • 1.2k views

Posted 2.9 years ago by zohreh.fazelan

score 1 · Answer 1 · 2022-05-18

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 28 minutes ago

United States

If you fit a model using all your data, you will have more degrees of freedom for estimating variability, which will tend to increase power to detect differences. That is why, all else being equal, people tend to recommend your first way rather than the second.
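To illustrate the degrees-of-freedom point, here is a minimal Python sketch. It is not DESeq2, which fits negative binomial models and shares information across genes when estimating dispersion; it is only an analogy using normally distributed toy data with the same variance in every group. The 3 replicates per group, the simulation settings, and the function names are all assumptions made for illustration.

```python
import random
import statistics


def pooled_variance(groups):
    """Return the pooled within-group variance and its residual degrees of freedom."""
    df = sum(len(g) - 1 for g in groups)
    ss = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    return ss / df, df


def simulate(n_groups, n_reps=3, n_sim=5000, seed=1):
    """Repeatedly draw toy data and record how much the pooled variance estimate varies.

    Assumption: every group has true variance 1, so a perfect estimate would be 1.0.
    """
    rng = random.Random(seed)
    estimates = []
    df = 0
    for _ in range(n_sim):
        groups = [
            [rng.gauss(0.0, 1.0) for _ in range(n_reps)]
            for _ in range(n_groups)
        ]
        var, df = pooled_variance(groups)
        estimates.append(var)
    # Spread of the estimates around their mean: smaller means a more stable estimate.
    return df, statistics.pstdev(estimates)


# "Second way": only the two groups being compared are in the model.
df_pair, spread_pair = simulate(n_groups=2)
# "First way": all four groups are in the model.
df_all, spread_all = simulate(n_groups=4)

print(f"two groups:  residual df = {df_pair}, spread of variance estimate = {spread_pair:.3f}")
print(f"four groups: residual df = {df_all}, spread of variance estimate = {spread_all:.3f}")
```

Under these assumptions the two-group fit has 4 residual degrees of freedom and the four-group fit has 8, and the variance estimate from the four-group fit is the more stable of the two.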

Posted 2.9 years ago by James W. MacDonald

Thank you for your response. But I have a few comparisons in which there are more significantly differentially expressed genes in the second way than in the first way. What about those?

Reply posted 2.9 years ago by zohreh.fazelan

There are no simple answers. In general, with very few observations you take the chance of sampling bias affecting your results. The only reason people get away with the lack of replication in this field is because replicates are expensive. In no other scientific endeavor would you be able to propose and get funding for an experiment with the number of replicates you have. For example, if you were to propose a dietary intervention and you said that there would be three subjects in each group, it would almost surely not get funded because any differences between the groups could just as easily be due to sampling bias rather than any actual differences.

The same is true here, but it's expensive and people generally think of these experiments as hypothesis generating, so it's common. It is actually worse though: in a dietary intervention you are just measuring one thing. Here you are measuring thousands of things, so the likelihood of type I errors creeping in is high. Having more replicates allows you to more accurately estimate the within-group variability, which should help you more accurately detect true differences, assuming that all the groups have similar expected variability.
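The multiple-testing point can be made concrete with a small Python sketch. It assumes 20,000 genes with no true differences at all, so every p-value is uniform on [0, 1]; the gene count, the 0.05 cutoff, the seed, and the function name are illustrative assumptions. It also includes a plain implementation of the Benjamini-Hochberg adjustment, which is the default method behind the padj column in DESeq2 results.

```python
import random


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Assumption: 20,000 genes, none of them truly different between groups.
rng = random.Random(0)
n_genes = 20000
alpha = 0.05
null_p = [rng.random() for _ in range(n_genes)]

# Genes that look "significant" without any correction.
n_raw = sum(p < alpha for p in null_p)

# Genes that remain after controlling the false discovery rate.
padj = benjamini_hochberg(null_p)
n_bh = sum(q < alpha for q in padj)

print(f"unadjusted p < {alpha}: {n_raw} of {n_genes} null genes")
print(f"BH-adjusted p < {alpha}: {n_bh} of {n_genes} null genes")
```

Roughly 5% of the null genes pass the unadjusted cutoff by chance alone, which is why adjusted p-values are used to call genes significant, and why a gene that is significant in one analysis but not the other deserves some suspicion.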

My bias would be that the genes that were significant with just the two groups and then not significant with all the groups are likely false positives. But like I said - hypothesis generating - so maybe you are willing to take that chance and are going to do something to validate those results. Or maybe the genes make sense biologically? You are the analyst, and you have to make these decisions and defend them to others, so it's up to you.

Reply posted 2.9 years ago by James W. MacDonald

score 0 · Answer 2 · 2022-05-19

zohreh.fazelan • 0

@07356868

Last seen 2.9 years ago

Canada

I appreciate your time and response.

Posted 2.9 years ago by zohreh.fazelan