Question: Differential abundance analysis of 16S data with extremely unbalanced cohorts in DESeq2
4 months ago, dr.aj.scott wrote:

For full disclosure, I posted this question on the phyloseq GitHub page, but perhaps this forum is more appropriate.

I have a 16S dataset from gut mucosa and want to analyse differential abundance according to a factor. I have 200 samples: 10 cases and 190 controls.

Q1: Is it valid to use DESeq2 to test for differential abundance with such a large imbalance between the numbers of cases and controls? I know that DESeq2 is designed to deal with some imbalance in sample sizes, but I'm unclear about whether this applies equally to 16S data as it does to RNA-seq data. There is significant inter-individual variation in 16S data that I'm concerned would prevent

Q2: Assuming the above is not a valid way to proceed (i.e. comparing 190 controls with 10 cases), how should this analysis be performed? Should I subsample from my controls (while trying to match other factors between cases and controls)? The problem with this approach is that comparisons using different subsamples produce different results (probably because of the inherently large inter-sample variability in 16S data). Also, on what basis would you decide the size of the control subsample? 10, 20, 30?

Q3: A further alternative could be to select x controls for comparison with the cases, then resample these controls n times and build a distribution of n fold changes for each taxon between my cases and controls. Is this statistically valid? How could such an approach be applied with DESeq2?
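
For concreteness, the resampling idea in Q3 could be sketched roughly as below. This is only an illustration of the procedure, not a DESeq2 analysis and not a claim that the approach is statistically valid. It uses plain Python with made-up counts for a single taxon, and it assumes the counts have already been normalised for sequencing depth. The function names, the pseudocount of 0.5, and the choices x=20 and n=500 are all hypothetical.

```python
import math
import random
import statistics


def log2_fold_change(case_counts, control_counts, pseudocount=0.5):
    """log2 ratio of mean counts (cases over controls) for a single taxon.

    The pseudocount is an arbitrary illustrative choice to avoid log(0).
    """
    case_mean = statistics.fmean(case_counts) + pseudocount
    control_mean = statistics.fmean(control_counts) + pseudocount
    return math.log2(case_mean / control_mean)


def resampled_fold_changes(cases, controls, x, n, seed=0):
    """Draw x controls without replacement, n times; return n log2 fold changes.

    cases, controls: counts for ONE taxon, one entry per sample.
    """
    rng = random.Random(seed)
    return [
        log2_fold_change(cases, rng.sample(controls, x))
        for _ in range(n)
    ]


# Synthetic counts for one taxon (invented numbers, not real 16S data):
# 10 cases and 190 controls, as in the question.
rng = random.Random(42)
cases = [rng.randint(20, 200) for _ in range(10)]
controls = [rng.randint(0, 150) for _ in range(190)]

# Distribution of fold changes over n = 500 control subsamples of size x = 20.
lfcs = sorted(resampled_fold_changes(cases, controls, x=20, n=500))
median_lfc = statistics.median(lfcs)
lower = lfcs[int(0.025 * len(lfcs))]
upper = lfcs[int(0.975 * len(lfcs)) - 1]
print(f"median log2FC = {median_lfc:.2f}; "
      f"central 95% of resamples: [{lower:.2f}, {upper:.2f}]")
```

Note that the same 10 cases are used in every resample, so the spread of this distribution reflects only the variability that comes from which controls are chosen, and the resamples are not independent of one another because they overlap.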

I'd be grateful for any insight anyone might have on this issue. I have researched the question but have not found it discussed anywhere.

Many thanks for your thoughts.

Answer: Differential abundance analysis of 16S data with extremely unbalanced cohorts in DESeq2
4 months ago, Michael Love (United States) wrote:

DESeq2 can handle imbalanced class size in RNA-seq.

As I’ve said on the forum previously, I’m not familiar with 16S data, and I’ve become skeptical that DESeq2 is the best tool for it, as the data doesn’t always look similar to RNA-seq. I just haven’t had any time to investigate, and it’s not my area of expertise.
