Hi!
Quick question about the type of count data DESeq2 expects as input: should I avoid doing any cleanup before normalizing my data with DESeq2? Currently, I remove contaminant OTUs (those that were present in the extraction kit or PCR reagents) from my OTU table. I am also considering removing OTUs with very few total reads across the whole dataset, because page 42 of the vignette states: "Users might consider first removing genes with very few reads, e.g. genes with row sum of 1, as this will speed up the fitting procedure."
However, page 4 states: "The count values must be raw counts of sequencing reads. This is important for DESeq2’s statistical model to hold, as only the actual counts allow assessing the measurement precision correctly."
Is it OK to remove contaminants and/or sparsely sampled OTUs before importing my data into DESeq2? I'm not sure whether my counts still qualify as "raw counts of sequencing reads" after these basic cleanup steps.
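To be concrete about what I mean by cleanup, here is a minimal sketch of the two filtering steps. It is written in plain Python only to show the logic, and it is not DESeq2 code. The OTU names, the counts, the contaminant list, and the `min_row_sum` threshold are all made up for illustration. The point is that whole rows are dropped, while the counts in the remaining rows are left exactly as they were.

```python
# Toy OTU table: keys are OTUs, values are raw read counts per sample.
# All names and numbers here are hypothetical, for illustration only.
otu_table = {
    "OTU_1": [120, 98, 143],
    "OTU_2": [0, 1, 0],         # row sum of 1 (sparsely sampled)
    "OTU_3": [15, 0, 22],
    "OTU_4": [300, 280, 310],   # assumed to be a reagent contaminant
}

# Hypothetical list of OTUs also found in the extraction kit / PCR reagents.
contaminants = {"OTU_4"}

# Hypothetical threshold: keep OTUs whose total reads are at least this value.
min_row_sum = 2


def filter_otus(table, contaminants, min_row_sum):
    """Drop contaminant rows and rows with too few total reads.

    Only whole rows are removed; counts in the kept rows are not modified.
    """
    return {
        otu: counts
        for otu, counts in table.items()
        if otu not in contaminants and sum(counts) >= min_row_sum
    }


filtered = filter_otus(otu_table, contaminants, min_row_sum)
print(filtered)
```

In this toy example only OTU_1 and OTU_3 remain, and their counts are still the original integers.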
Thanks!