Hello,
I have seen examples of design formulas that include the replicate factor and others that do not. When should the replicate factor go in the design formula, and when should it be left out?
For example, I have a dataset from independent inoculations of whole plants in the greenhouse. We expected the replicates to be variable, and the PCA confirmed it (although they vary less than the effect of the inoculation). I tested the design with and without the replicate factor: without it, I do not detect any interesting DEGs, but with it, I see the genes that are expected to change.
We also have another dataset where the biological replicates are just branches, so they do not vary much (as seen in the PCA), and in that case adding the replicate factor did not add any information.
So what does it mean to put the replicate factor in the design? Is it recommended when the replicates are very variable?
Thank you for your insights!
Thank you for your answer, it really helped me understand!
Indeed, I did not realize that my samples might be paired, and that adding the replicate to the design formula would therefore account for the variability between replicates. It makes sense to account for that variability, especially for true biological replicates from semi-controlled inoculations, by putting the replicate factor in the design formula, as is normally done for "paired" samples.
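For anyone reading later, here is a minimal sketch of why the pairing matters. It is not the actual differential expression model, only a plain t statistic computed with and without the pairing, and the expression values are made up for illustration. The unpaired version plays the role of a design like `~ condition`, and the paired version plays the role of `~ replicate + condition` (both formula names are assumptions, not taken from my real analysis).

```python
import math
import statistics

# Hypothetical log-expression values for one gene. Each position is one
# replicate (e.g. one independent inoculation), measured in both conditions.
# The replicates differ a lot from each other, but the treatment shift
# inside each replicate is small and consistent.
control = [5.0, 7.0, 9.0, 11.0]
treated = [6.0, 8.1, 9.9, 12.0]


def unpaired_t(a, b):
    """Welch t statistic: ignores which replicate each sample came from."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se


def paired_t(a, b):
    """Paired t statistic: compares the conditions within each replicate."""
    diffs = [y - x for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


t_unpaired = unpaired_t(control, treated)
t_paired = paired_t(control, treated)
print(f"unpaired t = {t_unpaired:.2f}")
print(f"paired   t = {t_paired:.2f}")
```

The mean difference is the same in both cases. Only the noise estimate changes: without pairing, the replicate-to-replicate spread is counted as noise and hides the effect, while with pairing it cancels out within each replicate. When the replicates barely differ, as with the branches, the two statistics end up close to each other, which matches what I saw.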