Handling heterogeneous data in DESeq2: controlling for multiple covariates
I'm trying to account for batch effects and variation in quality for a set of samples that were sequenced by at least 5 different research groups. They were sequenced with different read lengths, and in my PCA I'm seeing that a few seqQC metrics are driving most of the variation in PC1. (I checked the ICC values of my categorical variables in this set, and they don't seem to be influencing the PCs as much as these quality metrics are.) That being said, how can I incorporate 2-3 covariates in the design matrix so the analysis ignores variation due to those variables? I'm sorry for such a basic question; I just started working with DESeq2 a few weeks ago, so I'm new to all of this. Thank you for your help!

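There is no magic here. The design is simply ~ cov1 + cov2 + ... + condition, with the covariates listed first and the variable you actually want to test last. The one thing you have to make sure of is that the covariates are not nested within (or otherwise perfectly confounded with) any of your actual experimental variables. If they are, DESeq2 will not accept the design: it stops with an error saying the model matrix is not full rank.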
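As a rough illustration, here is a minimal sketch in R. The column names (group, read_length, qc_metric, condition), the sample table, and the simulated counts are all made up for the example and are not from the original post. The first part uses only base R to check whether a design is full rank, which is the condition DESeq2 needs; the last part runs only if DESeq2 is installed.

```r
# Hypothetical sample table: every column name and value here is a placeholder.
coldata <- data.frame(
  condition   = factor(rep(c("control", "treated"), times = 6)),
  group       = factor(rep(c("lab1", "lab2", "lab3"), each = 4)),
  read_length = factor(rep(c("len75", "len150"), each = 2, times = 3)),
  qc_metric   = c(0.91, 0.88, 0.79, 0.83, 0.95, 0.90,
                  0.72, 0.77, 0.86, 0.89, 0.81, 0.84)
)

# Center and scale the continuous covariate before putting it in the design.
coldata$qc_metric <- as.numeric(scale(coldata$qc_metric))

# Covariates first, variable of interest last.
design_formula <- ~ group + read_length + qc_metric + condition

# Base-R check: the design is usable only if the model matrix is full rank.
mm <- model.matrix(design_formula, data = coldata)
full_rank <- qr(mm)$rank == ncol(mm)

# Counter-example: read length perfectly tracks condition, so the two
# cannot be separated and the model matrix loses a rank.
coldata_bad <- coldata
coldata_bad$read_length <- factor(
  ifelse(coldata$condition == "treated", "len150", "len75")
)
mm_bad <- model.matrix(design_formula, data = coldata_bad)
bad_full_rank <- qr(mm_bad)$rank == ncol(mm_bad)

# Optional: fit the full-rank design with DESeq2 on simulated counts.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rnbinom(200 * nrow(coldata), mu = 100, size = 10),
                   nrow = 200)
  colnames(counts) <- rownames(coldata)
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = coldata,
                                        design = design_formula)
  dds <- DESeq2::DESeq(dds)
  # Effect of condition, adjusted for the covariates in the design.
  res <- DESeq2::results(dds, contrast = c("condition", "treated", "control"))
}
```

Here full_rank should come out TRUE and bad_full_rank FALSE. If a check like this fails on your real sample table, the usual options are to drop one of the confounded covariates or to accept that its effect cannot be separated from the experimental variable with this data.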