Hi all,
I'm currently working with 16S data from an experiment involving two different strains of mice, sampled at 4 time points between P17 and P84. I'm trying to identify differentially abundant taxa between genotypes at each postnatal age sampled. In addition, I'm hoping to use some previously acquired metabolite data to identify taxa whose abundance is associated with SCFA levels, treating those levels as a continuous predictor variable. I have a small sample size due to the pilot nature of the study, amounting to 12 fecal and 12 cecal samples for each sampling group. I've worked through my data using both the ALDEx2 and DESeq2 pipelines, although I'm not sure my analyses are set up well for the experimental questions I want to answer.
My metabolite data are derived from NMR spectroscopy, so all of the values are relative intensities ranging from 0 to 1. I'm concerned that this is distorting my results, given that the log2 fold change values DESeq2 outputs are per unit of a continuous predictor, and a one-unit change here spans the entire possible range of my predictor.
I've provided an example of my code for the fecal samples from the P28 timepoint below, looking at differential taxa in relation to butyrate levels. This has been repeated for all timepoints independently after subsetting data by age.
```r
library(phyloseq)
library(DESeq2)
library(tibble)  # for rownames_to_column()

# dat_pr is an un-normalized sequence count table of ASVs for each sample
# Subset the fecal samples to the P28 time point
dat_pr_fecal_28_ap = subset_samples(dat_pr_fecal_clean_met, Age == 28)

# Model ASV counts as a function of butyrate level (continuous predictor)
dds_fecal_28 = phyloseq_to_deseq2(dat_pr_fecal_28_ap, ~ butyrate_levels)
dds_fecal_28 = DESeq(dds_fecal_28)

# Extract results for the butyrate coefficient, with independent filtering turned off
res_fecal_28 = results(dds_fecal_28, name = 'butyrate_levels', independentFiltering = FALSE)
res_fecal_28

# Convert to a data frame with the ASV IDs as a column
res_df_fecal_28 = rownames_to_column(as.data.frame(res_fecal_28), 'ASV')
head(res_df_fecal_28)
```
This code returns no significantly differentially abundant taxa (no adjusted p-values < 0.01). In addition, the volcano plot shows some odd behavior: the -log p-values seem to plateau at a certain ceiling below this threshold (see here).
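To make the scaling concern above more concrete, here is a toy sketch in base R. It uses simulated counts for a single made-up ASV and a plain Poisson GLM as a stand-in for the DESeq2 negative binomial model, so it is not what DESeq2 does internally, and every name and number in it is hypothetical rather than taken from my data. It compares the slope and Wald p-value when the predictor is left on its 0-1 scale versus centered and scaled to unit standard deviation.

```r
# Toy illustration with simulated data; all names and values are hypothetical.
set.seed(1)
n <- 12                                               # samples per group, as in my pilot
butyrate <- runif(n, min = 0, max = 1)                # relative intensity on a 0-1 scale
counts <- rpois(n, lambda = exp(3 + 1.5 * butyrate))  # counts for one simulated ASV

# Raw 0-1 scale: the slope is the log fold change per 1.0 unit of intensity,
# i.e. across the whole possible range of the predictor.
fit_raw <- glm(counts ~ butyrate, family = poisson)

# Centered and SD-scaled predictor: the slope is the log fold change per 1 SD.
butyrate_z <- as.numeric(scale(butyrate))
fit_z <- glm(counts ~ butyrate_z, family = poisson)

# Convert the natural-log slopes to log2 fold changes.
lfc_raw <- unname(coef(fit_raw)["butyrate"]) / log(2)
lfc_z <- unname(coef(fit_z)["butyrate_z"]) / log(2)

# Wald p-values for the slope in each fit.
p_raw <- coef(summary(fit_raw))["butyrate", "Pr(>|z|)"]
p_z <- coef(summary(fit_z))["butyrate_z", "Pr(>|z|)"]

print(c(lfc_per_unit = lfc_raw, lfc_per_sd = lfc_z, sd = sd(butyrate)))
print(c(p_raw = p_raw, p_z = p_z))
```

In this toy model the per-SD log2 fold change is just the per-unit one multiplied by the standard deviation of the predictor, and the Wald p-value does not change. If that carries over to DESeq2 (which I am assuming, not asserting), rescaling butyrate_levels would change how the fold changes read but would not by itself explain the lack of significant taxa.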
Please let me know if my model is constructed correctly, and if there is anything I'm missing that may be impinging upon my results!
Thanks so much in advance :)
Cross-posted: https://www.biostars.org/p/462018/