Question

Handling biological replicates when analyzing differential abundance of OTUs with DESeq2

jport ▴ 10

I have a question about analyzing differential abundance of OTUs with DESeq2. As an example, I have 3 samples with 3 replicates each, and I want to test whether selected OTUs differ significantly across all habitats (similar to ANOVA). The count table below shows the setup.

With the DESeq function (design = ~habitat, where all replicates of sample 1 = habitat1, all replicates of sample 2 = habitat2, and all replicates of sample 3 = habitat3) I am able to perform pairwise comparisons across the habitats (e.g. habitat1 vs. habitat2, habitat1 vs. habitat3, etc.), and with the LRT option (reduced = ~1) I can perform a global comparison across all habitats.

But I assume this LRT treats every replicate separately and performs the test across all replicates of all samples in the three habitats (sample1_rep1 vs. sample1_rep2 vs. sample1_rep3 vs. sample2_rep1...vs. sample3rep3). But is there a way to group the replicates in some way first before running the LRT so that the test is not by replicate but is by habitat with grouped replicates? These are biological replicates, so I would prefer not to collapse them by summing the counts, as is recommended for technical replicates.

               OTU1   OTU2  OTU3
Sample1_rep1   5200  32508  1890
Sample1_rep2    356  52541     0
Sample1_rep3   2453  28167  3814
Sample2_rep1  11897  31699     0
Sample2_rep2   4690  49127     0
Sample2_rep3   4950  47731     0
Sample3_rep1   3925   6182   513
Sample3_rep2   3148   9783   362
Sample3_rep3   1241   6166     0

Thanks much,

Jesse Port

Tag: deseq2

Updated 9.4 years ago by Michael Love • written 9.4 years ago by jport

score 0 · Answer 1 · 2014-12-08

"But I assume this LRT treats every replicate separately and performs the test across all replicates of all samples in the three habitats (sample1_rep1 vs. sample1_rep2 vs. sample1_rep3 vs. sample2_rep1...vs. sample3rep3). But is there a way to group the replicates in some way first before running the LRT so that the test is not by replicate but is by habitat with grouped replicates?"

Hi Jesse,

I think you're just off on the assumption of what's happening. A likelihood ratio test with full design ~ habitat and reduced design ~ 1 does group the samples by habitat: the full model fits one level per habitat, shared by all replicates from that habitat, not one level per replicate. The test asks whether the model that groups the samples by habitat has a significantly higher likelihood than the model that fits only an intercept term. The Wikipedia page on the LRT might offer some useful background: http://en.wikipedia.org/wiki/Likelihood-ratio_test
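To make the grouping concrete, here is a minimal sketch of the setup described in the question. It assumes simulated placeholder counts (200 hypothetical OTUs, not the table above, since three OTUs are probably too few for DESeq2's dispersion estimation), and the object names (counts, coldata, habitat) are illustrative only. The design matrix has three columns, one per habitat, so the nine replicates share three fitted levels and the LRT compares a three-parameter model against a one-parameter model for each OTU.

```r
library(DESeq2)

set.seed(1)
n_otus <- 200  # hypothetical number of OTUs, for illustration only

# Three habitats, three biological replicates each (one column per replicate)
habitat <- factor(rep(c("habitat1", "habitat2", "habitat3"), each = 3))

# Simulated placeholder counts (NOT the poster's data): OTUs in rows, samples in columns
base_mean <- 2^runif(n_otus, 4, 12)
counts <- matrix(rnbinom(n_otus * 9, mu = rep(base_mean, times = 9), size = 2),
                 nrow = n_otus)
rownames(counts) <- paste0("OTU", seq_len(n_otus))
colnames(counts) <- paste0("Sample", rep(1:3, each = 3), "_rep", rep(1:3, times = 3))

coldata <- data.frame(habitat = habitat, row.names = colnames(counts))

# The full design has 3 columns (intercept + 2 habitat effects), not 9:
# replicates from the same habitat share the same fitted level
mm <- model.matrix(~ habitat, data = coldata)
print(mm)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ habitat)

# Likelihood ratio test: full model ~ habitat versus reduced model ~ 1
dds <- DESeq(dds, test = "LRT", reduced = ~ 1)
res <- results(dds)

# One p-value per OTU for "any difference across the three habitats"
head(res[order(res$pvalue), ])
```

The replicates are kept as separate columns and are never summed. Their spread within each habitat is what informs the dispersion estimate, which is why biological replicates should stay uncollapsed.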