Hello Bioconductor community,
I have used DESeq2 to test for differences in the overall transcriptional activity of species in a microbiome sample. However, i am not quite sure whether what I'm doing is correct. Thus, I would be very thankful for somebody confirming that this solution makes sense or pointing me to an error.
I am comparing two different biological conditions. For each condition, i have sequencing data from DNA and RNA libraries. This coldata-table describes the data:
I use metagenome assembled genomes representing the microbial species which are present in the sample. To generate count data, i map DNA and RNA reads to these genomes. The DNA and RNA libraries have vastly different sequencing depth and read length, but from what i understand about the DESeq2 normalization I assume that this won't be an issue. So i end up with a count table that looks like this (counts made up):
What i want to test is whether individual species change their transcriptional activity. So something like: "Is species A pumping out more mRNA per individual cell under condition high?". To that end, I'm using the likelihood ratio test.
dds_raw <- DESeqDataSetFromMatrix(countData=count, colData=col, design = ~ condition + type + condition:type) dds <- DESeq(dds_raw, test="LRT", reduced= ~ condition + type) results_lrt <- results(dds)
I then interpret low p-values to mean that the activity of the respective species has changed, since the differences between samples cannot be sufficiently explained without taking into account whether we are looking at genome- or transcriptome-abundance.
Is this the correct way of using DESeq2 to find out what i want to know? Is it even valid to use the package like that? It seems like I have strayed pretty far from the original purpose of the software..
Thank you for any input,