Hi everyone,
I have several soil amplicon-seq metagenomics data sets (16S, 18S, ITS) and I want to see whether some communities are significantly differentially abundant among treatments. When I compare a factor with 2 levels (Treatment: A, B), I use the following commands:
    library(DESeq2)
    dia = phyloseq_to_deseq2(physeq_object, ~Treatment)
    dia = DESeq(dia, test = "Wald", fitType = "parametric", parallel = FALSE)
    res = results(dia, cooksCutoff = FALSE)
With this, I get a list of Differentially Expressed Communities (DEC). If I replace ~Treatment with ~0 + Treatment, I don't get the same results (I get more DEC with ~0 + Treatment).
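To make the difference between the two formulas concrete, here is a minimal base-R sketch of the design matrices they produce. The six samples are made-up toy data (not my real samples), and model.matrix() is used only to show the columns. It is not a claim about which design is correct.

```r
# Toy data: 6 hypothetical samples, 3 per treatment level (assumption, for illustration only).
samples <- data.frame(Treatment = factor(c("A", "A", "A", "B", "B", "B")))

# With an intercept: one column for the baseline level (A) and one for the B-vs-A difference.
with_intercept <- model.matrix(~ Treatment, data = samples)
print(colnames(with_intercept))

# Without an intercept: one column per level, and no column that encodes a difference.
no_intercept <- model.matrix(~ 0 + Treatment, data = samples)
print(colnames(no_intercept))
```

Because the columns mean different things, the coefficient names listed by resultsNames(dia) also differ between the two designs.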
Q1. Which of the two approaches is more suitable for my type of analysis?
Also, when I have a factor with more than 2 levels (Regie: treatment1, treatment2, treatment3, treatment4), I use the following commands:
    library(DESeq2)
    dia = phyloseq_to_deseq2(physeq_object, ~0 + Regie)
    dia = DESeq(dia, test = "Wald", fitType = "parametric", parallel = FALSE)
    resultsNames(dia)
    res = results(dia, contrast = c("Regie", "treatment1", "treatment2"), cooksCutoff = FALSE)
I run the last line inside a loop that goes through the other possible pairwise comparisons.
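Roughly, the loop looks like the sketch below. The level names come from my Regie factor. The helper names (regie_levels, level_pairs, contrast_list) are made up for this example, and the results() call is left as a comment because it needs the fitted dia object from above.

```r
# Levels of the Regie factor, as listed above.
regie_levels <- c("treatment1", "treatment2", "treatment3", "treatment4")

# All unordered pairs of levels (4 levels give 6 pairs).
level_pairs <- combn(regie_levels, 2, simplify = FALSE)

# One contrast vector per pair, in the form results() expects:
# c(factor name, numerator level, denominator level).
contrast_list <- lapply(level_pairs, function(p) c("Regie", p[1], p[2]))
names(contrast_list) <- vapply(level_pairs, paste, character(1), collapse = "_vs_")

print(names(contrast_list))

# With the fitted object `dia`, each comparison would then be extracted like this:
# res_list <- lapply(contrast_list, function(ct) {
#   results(dia, contrast = ct, cooksCutoff = FALSE)
# })
```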
Q2. Is this the right thing to do if I want to compare multiple levels within the same factor? All the examples I've seen with multiple levels were comparisons between factors (Condition1: A, B, C and Condition2: D, E, F; e.g. A vs D, A vs E, etc.), but in my case the comparisons are always within the same factor.
Q3. The DESeq() function has several options. Are any of them recommended for metagenomics data (lots of zeros and low counts)? I tried some of them (e.g. test="LRT", reduced=~1, sfType="poscounts"), but I don't know which one is better. Any thoughts on that?
Thanks for your help, it's really appreciated!
Cheers,
PY
Thanks for the quick answer!
I'll try to find more information about metagenomics analyses and get advice from a statistician.
Have a nice day!