Search
Question: DESeq with hierarchical FDR correction
0
3.2 years ago by
United States
liam.philip.shaw0 wrote:

I've been using DESeq2 in Phyloseq  (see this tutorial) for differential abundance testing in a microbiome data set i.e. OTUs, not genes.

The default multiple testing correction applied in the 'results' function is Benjamini-Hochberg. However, OTUs are not independent. There is information available about the specific structure of their relations in the form of a phylogenetic tree.

The package structSSI outlines a hierarchical procedure for multiple testing correction that takes this specific structure of the hypotheses into account.

I'd like to use this hierarchical FDR correction for DESeq results. I understand that running DESeq involves 3 functions.

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)

The first two can be run once for the entire dataset (at the OTU level). I then want to be able to fit negative binomial models for the combined abundance of all of the children of each node in the phylogenetic tree somehow using the information from the dispersion fit.

Is there a way to extract the dispersion fit calculated on all the species for use in the model fitting, given that this will be on a new DESeq dataset containing the combined abundances of lots of OTUs?

Thanks,

Liam

modified 3.2 years ago by Michael Love19k • written 3.2 years ago by liam.philip.shaw0
0
3.2 years ago by
Michael Love19k
United States
Michael Love19k wrote:

hi Liam,

Just to note: the BH procedure does not require that the p-values be independent, in this follow-up paper, it is shown to still control the FDR under positive regression dependency of test statistics under the null:

https://projecteuclid.org/euclid.aos/1013699998

For your other question about accessing values, see the vignette section, "Access to all calculated values" on how to extract the different estimates of dispersion.

Hi Michael,

I had read that section of the vignette and accessed the individual estimates of dispersion and the priors. My question about the 'fit' was because I'm still unclear about how the dispersion prior is used in the fitting - very much not a statistician - but I'll keep reading the vignette until it makes sense.

Best,

Liam