phyloseq_to_deseq2 design strategy for multiple factors analysis (2 or more)
@pyveronneau71-23108

Hi everyone,

I have several soil amplicon-seq metagenomics data sets (16S, 18S, ITS) and I want to test whether some communities are significantly differentially abundant among treatments. When I compare a condition with two levels (Treatment: A, B), I use the following commands:

    library(phyloseq)  # provides phyloseq_to_deseq2()
    library(DESeq2)
    dia <- phyloseq_to_deseq2(physeq_object, ~ Treatment)
    dia <- DESeq(dia, test = "Wald", fitType = "parametric", parallel = FALSE)
    res <- results(dia, cooksCutoff = FALSE)

With this, I get a list of Differentially Expressed Communities (DEC). If I replace ~Treatment with ~0 + Treatment, I don't get the same results (I get more DEC with ~0 + Treatment).

Q1. Which of the two approaches is more suitable for my type of analysis?

Also, when I have a condition with more than two levels (Regie: treatment1, treatment2, treatment3, treatment4), I use the following commands:

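    library(phyloseq)  # provides phyloseq_to_deseq2()
    library(DESeq2)
    dia <- phyloseq_to_deseq2(physeq_object, ~ 0 + Regie)
    dia <- DESeq(dia, test = "Wald", fitType = "parametric", parallel = FALSE)
    resultsNames(dia)  # list the coefficient names available for contrasts
    res <- results(dia, contrast = c("Regie", "treatment1", "treatment2"), cooksCutoff = FALSE)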

I run the last line in a loop over the other possible pairwise comparisons.
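
The loop over pairwise comparisons described above could be set up along these lines. This is only a sketch in base R: it builds the contrast vector for every pair of levels, and the names `regie_levels`, `level_pairs`, and `contrast_list` are illustrative, not taken from the original analysis. The commented-out last step assumes a fitted DESeqDataSet called `dia`, as in the commands above.

```r
# Levels of the Regie factor (taken from the question above).
regie_levels <- c("treatment1", "treatment2", "treatment3", "treatment4")

# All unordered pairs of levels: choose(4, 2) = 6 comparisons.
level_pairs <- combn(regie_levels, 2, simplify = FALSE)

# One contrast vector per pair, in the three-element form used above:
# c(factor name, numerator level, denominator level).
contrast_list <- lapply(level_pairs, function(p) c("Regie", p[1], p[2]))
names(contrast_list) <- vapply(
  level_pairs, function(p) paste(p, collapse = "_vs_"), character(1)
)

# With a fitted DESeqDataSet `dia` (assumed, not created here), each
# contrast could then be passed to results():
# res_list <- lapply(contrast_list, function(ct)
#   results(dia, contrast = ct, cooksCutoff = FALSE))
```

Keeping the results in a named list makes it easy to see which table belongs to which comparison.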

Q2. Is this the right approach if I want to compare multiple levels within the same condition? All the examples I've seen with multiple levels compared across conditions (Condition1: A, B, C and Condition2: D, E, F; e.g. A vs D, A vs E, etc.), but in my case the comparisons are always within the same condition.

Q3. There are several options for the DESeq() function. Are any of them recommended for metagenomics data (lots of zeros and low counts)? I tried some of them (e.g. test = "LRT", reduced = ~ 1, sfType = "poscounts") but I don't know which is better. Any thoughts on that?

Thanks for your help, it's really appreciated!

Cheers,

PY

@mikelove

I don't have much feedback on using DESeq2 for microbiome / metagenomic analyses. I'm not convinced it is always a good model for these data types, and I have done no development of the software to support microbiome / metagenomics, so I do not have any particular recommendations. To the degree that these data are similar to certain single-cell datasets (which I have profiled in collaboration with the zingeR and ZINB-WaVE authors), we found that LRT and poscounts were good options for data more compatible with zero-inflated NB distributions.

Re: replacing ~treatment with ~0 + treatment, these are different model matrices with different interpretations of the coefficients.
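
As an illustration of that point, here is a minimal sketch in base R using a made-up two-level data set (the `toy` data frame and its values are hypothetical, and an ordinary linear model stands in for the DESeq2 fit). It shows that the two formulas produce different design-matrix columns, so a coefficient with the same name means something different in each.

```r
# Hypothetical toy data: three samples per treatment level.
toy <- data.frame(
  Treatment = factor(rep(c("A", "B"), each = 3)),
  y = c(1, 1, 1, 3, 3, 3)
)

# Design matrices for the two formulas.
mm_intercept    <- model.matrix(~ Treatment, data = toy)
mm_no_intercept <- model.matrix(~ 0 + Treatment, data = toy)

colnames(mm_intercept)     # "(Intercept)" "TreatmentB"
colnames(mm_no_intercept)  # "TreatmentA"  "TreatmentB"

# Fit the same data with both codings.
fit_intercept    <- coef(lm(y ~ Treatment, data = toy))
fit_no_intercept <- coef(lm(y ~ 0 + Treatment, data = toy))

fit_intercept     # (Intercept) = mean of A; TreatmentB = B minus A
fit_no_intercept  # TreatmentA = mean of A; TreatmentB = mean of B
```

With the intercept, `TreatmentB` is already the B versus A difference. Without it, `TreatmentB` is the B group mean, so testing that single coefficient asks whether the group mean differs from zero, and a B versus A comparison needs an explicit contrast between the two columns.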

You need to discuss all modeling choices with a statistician if you are unsure of the meaning of the coefficients (this is outside the scope of support I can provide).


Thanks for the quick answer!

I'll try to find more info about metagenomics analyses and get advice from a statistician.

Have a nice day!
