I have several treatment groups (infected and control). When I run DESeq2, I get different results depending on which of two ways I set up the comparisons.
First way: one factor (treatment) with four levels, all in a single analysis:
factor level 1: moribund infected
factor level 2: 5-week infected
factor level 3: moribund control
factor level 4: 5-week control
This gives me 6 result files (the pairwise comparisons among the factor levels).
Second way: I compared just 2 groups to each other at a time, each time with the factor treatment restricted to two levels:
moribund infected vs. 5-week infected
moribund infected vs. moribund control
moribund infected vs. 5-week control
5-week infected vs. moribund control
5-week infected vs. 5-week control
moribund control vs. 5-week control
The results differ between these two ways of comparing. Please guide me.
Thank you for your response. But I have a few comparisons in which there are more significantly differentially expressed genes in the second way than in the first way. What about those?
There are no simple answers. In general, with very few observations you run the risk of sampling bias affecting your results. The only reason people get away with the lack of replication in this field is that replicates are expensive. In no other scientific endeavor would you be able to propose and get funding for an experiment with the number of replicates you have. For example, if you were to propose a dietary intervention and you said that there would be three subjects in each group, it would almost surely not get funded, because any differences between the groups could just as easily be due to sampling bias as to any actual differences.
The same is true here, but replication is expensive and people generally think of these experiments as hypothesis generating, so it's common. It is actually worse, though: in a dietary intervention you are measuring just one thing. Here you are measuring thousands of things, so the likelihood of type I errors creeping in is high. Having more replicates allows you to estimate the within-group variability more accurately, which should help you detect true differences more accurately, assuming that all the groups have similar expected variability.
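To make the point about within-group variability a bit more concrete, here is a toy simulation in Python (standard library only). It is not what DESeq2 does internally: DESeq2 works with negative binomial dispersions on count data, whereas this sketch just draws normally distributed values and computes a plain pooled variance. The function names `pooled_variance` and `simulate`, the 3 replicates per group, the 2000 simulated genes, and the true variance of 1 are all assumptions picked for illustration, not values from your experiment.

```python
import random
import statistics


def pooled_variance(groups):
    """Pooled within-group variance: per-group sample variances
    weighted by their degrees of freedom."""
    numerator = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    degrees_of_freedom = sum(len(g) - 1 for g in groups)
    return numerator / degrees_of_freedom


def simulate(n_groups, n_reps=3, n_genes=2000, true_sd=1.0, seed=1):
    """Return one pooled variance estimate per simulated gene."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_genes):
        # Assumption: every group shares the same true variance
        # (the "similar expected variability" condition).
        groups = [[rng.gauss(0.0, true_sd) for _ in range(n_reps)]
                  for _ in range(n_groups)]
        estimates.append(pooled_variance(groups))
    return estimates


# Variance estimated from only the two groups being compared ...
two_group_estimates = simulate(n_groups=2)
# ... versus variance estimated from all four groups together.
four_group_estimates = simulate(n_groups=4)

# How much the per-gene estimates scatter around the true variance of 1.
spread_two = statistics.stdev(two_group_estimates)
spread_four = statistics.stdev(four_group_estimates)

print(f"spread of variance estimates, 2 groups: {spread_two:.3f}")
print(f"spread of variance estimates, 4 groups: {spread_four:.3f}")
```

With four groups the pooled estimate rests on 8 residual degrees of freedom instead of 4, so the per-gene estimates should scatter less around the true value. That only holds under the equal-variability assumption; if one group is much noisier than the others, pooling can hurt rather than help.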
My bias would be that the genes that were significant with just the two groups and then not significant with all the groups are likely false positives. But like I said - hypothesis generating - so maybe you are willing to take that chance and are going to do something to validate those results. Or maybe the genes make sense biologically? You are the analyst, and you have to make these decisions and defend them to others, so it's up to you.