logFC when counts are zero across all samples in the two groups
@aditya-bandla-24072

My dataset, from a microbiome study, has a two-factor design: one factor with two levels and one with three. I am interested in both the main effects and the interaction. The first factor is fire, with levels burnt and non-burnt; the second is depth, with levels 0-5, 35-40 and 95-100.

I used the phyloseq wrapper phyloseq_to_deseq2 to convert my phyloseq object into a DESeqDataSet, and then ran an LRT test:

# full model with the interaction; the LRT against the reduced model
# tests the interaction terms
fire_dds <-
  phyloseq_to_deseq2(btp_ps_fire_filt, ~ fire + depth + fire:depth) %>%
  DESeq(., sfType = "poscounts", test = "LRT", reduced = ~fire + depth)

One of the contrasts I was interested in was the difference between 35-40 and 0-5 within the non-burnt samples, which I extracted using:

resultsNames(fire_dds)
'Intercept' 'fire_non_burnt_vs_burnt' 'depth_35_40_vs_0_5' 'depth_95_100_vs_0_5' 'firenon_burnt.depth35_40' 'firenon_burnt.depth95_100'

results(fire_dds,
  # depth_35_40_vs_0_5 + firenon_burnt.depth35_40,
  # i.e. the 35-40 vs 0-5 effect in the non-burnt samples
  contrast = c(0, 0, 1, 0, 1, 0),
  test = "Wald"
)

However, when I inspected the results using plotCounts, and also by examining the counts matrix returned by counts(fire_dds, normalized = TRUE), I could see that for some features with a non-zero logFC and padj <= 0.05 the counts were in fact zero in all samples across both levels. One such feature is ASV_325. I am confused about how to interpret this, or what gives rise to such a result; I was expecting an NA to be returned in this case.
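Roughly, the check looked like this (the intgroup choice is just one sensible option):

# plot the counts for one suspicious feature, grouped by both factors
plotCounts(fire_dds, gene = "ASV_325", intgroup = c("fire", "depth"))

# and look at its normalized counts directly
counts(fire_dds, normalized = TRUE)["ASV_325", ]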

I also came across a handful of features that had counts in only one sample across both groups, yet still had padj <= 0.05 and a logFC. I would appreciate any help with interpreting these results, or with any filtering that can suppress such features.

The dds object is available as an RDS file here: https://bit.ly/3jrcGx3

@mikelove

I've seen this happen in non-RNA-seq datasets when the counts are not plausibly negative binomial, e.g. bimodal within the groups. So then one approach would be to use non-parametric testing such as Wilcoxon-based methods.
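A rough per-feature version of that (outside DESeq2) could look like the sketch below; the level labels "non_burnt", "0_5" and "35_40" are guesses from your resultsNames() output, so check levels(colData(fire_dds)$fire) and $depth for the actual coding.

# Wilcoxon rank-sum test per feature on normalized counts, restricted to
# the non-burnt samples, comparing depth 35-40 vs 0-5
norm_counts <- counts(fire_dds, normalized = TRUE)
cd  <- colData(fire_dds)
sel <- cd$fire == "non_burnt" & cd$depth %in% c("0_5", "35_40")
grp <- droplevels(factor(cd$depth[sel]))

nonzero <- rowSums(norm_counts[, sel]) > 0   # skip all-zero features
pvals <- apply(norm_counts[nonzero, sel], 1,
               function(x) wilcox.test(x ~ grp)$p.value)   # tie warnings expected
padj_wilcox <- p.adjust(pvals, method = "BH")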

But I think another practical approach that I've seen work in these cases is to use lfcShrink on this comparison and then filter out any features with a small shrunken LFC.
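In code, that could look something like this for the contrast above (using ashr, which accepts arbitrary numeric contrasts; the |LFC| cutoff of 1 is only an example):

# shrink the LFCs for this contrast, then keep only features with both a
# small adjusted p-value and a non-trivial shrunken LFC
res <- results(fire_dds, contrast = c(0, 0, 1, 0, 1, 0), test = "Wald")
res_shr <- lfcShrink(fire_dds, contrast = c(0, 0, 1, 0, 1, 0),
                     res = res, type = "ashr")
res_shr[which(res_shr$padj <= 0.05 & abs(res_shr$log2FoldChange) >= 1), ]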


Thanks, Mike. I gave it a go, but the problem still persists. The feature still has a large LFC after shrinkage.

I used lfcShrink(dds = dds, res = res, contrast = contrast, type = "ashr"), as type = "normal" isn't compatible with a design containing interactions, and I am not sure how to specify coef for a contrast that combines multiple coefficients.

Given this, my current solution is to do post-hoc filtering, getting rid of features that are not present in a minimum number of samples in either of the groups (roughly as sketched below). It looks like I might have to take a deeper look at the distributions to see what is going on.
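For concreteness, the filter is something like this (min_n = 3 and the level labels are assumptions, not fixed choices):

# keep features detected in at least min_n samples in at least one of
# the two groups being compared
min_n <- 3
cd <- colData(fire_dds)
g1 <- cd$fire == "non_burnt" & cd$depth == "0_5"
g2 <- cd$fire == "non_burnt" & cd$depth == "35_40"

raw  <- counts(fire_dds)   # raw counts; only presence/absence matters here
keep <- rowSums(raw[, g1] > 0) >= min_n | rowSums(raw[, g2] > 0) >= min_n
res_filt <- res_shr[keep, ]   # or whichever results table is being filtered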


Sorry, I think this would work with apeglm specifically.


Thanks, Mike. I used lfcShrink with apeglm and filtered out features with an absolute logFC < 1, and it worked like a charm!
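For anyone finding this later: since apeglm takes coef rather than contrast, one way to get this comparison as a single coefficient is to relevel fire so that non_burnt is the reference and refit. The level names and the |LFC| < 1 cutoff follow the thread, but the exact code below is only a sketch, not necessarily what was run.

# make non_burnt the reference level so that depth_35_40_vs_0_5 becomes
# the 35-40 vs 0-5 effect within the non-burnt samples
fire_dds$fire <- relevel(fire_dds$fire, ref = "non_burnt")
fire_dds <- DESeq(fire_dds, sfType = "poscounts")   # refit (default Wald test)

resultsNames(fire_dds)   # now contains "depth_35_40_vs_0_5" for non-burnt
shr <- lfcShrink(fire_dds, coef = "depth_35_40_vs_0_5", type = "apeglm")
shr[which(shr$padj <= 0.05 & abs(shr$log2FoldChange) >= 1), ]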
