I have RNA-seq data for human subjects and plan to do a differential gene expression analysis using DESeq2. The covariates I plan to use are sex, age, genotypic principal components, and substance use. However, some subjects are missing values for the principal components and for substance use, and I would like to know how to code these missing values in the SampleTable. Can I put "." for a missing value, or do I have to remove the subjects with missing data? Thanks.
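To make the second option concrete, here is a minimal sketch of listing which samples would be dropped if only complete cases are kept. It is plain Python run on the sample table before it goes into R, not DESeq2 code. The column names (`sample`, `PC1`, `substance_use`) and the values are hypothetical, and treating ".", an empty cell, and "NA" as missing is an assumption. As far as I know, DESeq2 will not accept NA in a variable used in the design formula, and a "." in a numeric column would cause the whole column to be read as text, so checking for gaps up front seems worthwhile.

```python
import csv
import io

# Hypothetical sample table; "." and an empty cell stand in for missing values.
SAMPLE_TABLE = """sample,sex,age,PC1,substance_use
S1,F,34,0.012,yes
S2,M,41,.,no
S3,F,29,-0.034,
S4,M,52,0.008,no
"""

# Assumption: these strings are how missing values were entered.
MISSING = {"", ".", "NA"}
# Assumption: these are the covariates that would appear in the design formula.
DESIGN_COLUMNS = ["sex", "age", "PC1", "substance_use"]


def split_complete_cases(rows, columns):
    """Separate samples with every design covariate present from those with a gap."""
    complete, incomplete = [], []
    for row in rows:
        has_gap = any(row[c].strip() in MISSING for c in columns)
        (incomplete if has_gap else complete).append(row)
    return complete, incomplete


rows = list(csv.DictReader(io.StringIO(SAMPLE_TABLE)))
complete, incomplete = split_complete_cases(rows, DESIGN_COLUMNS)

kept_ids = [r["sample"] for r in complete]
dropped_ids = [r["sample"] for r in incomplete]
print("keep:", kept_ids)
print("drop:", dropped_ids)
```

If samples are dropped this way, the same samples have to be removed from the columns of the count matrix so that the two stay aligned.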
This post overlaps with a question I'd like to ask, though not entirely, so I am adding my question here:
A couple of samples need to be removed because their RNA-seq data are low quality or look dissimilar to the others in the sample distance analysis. The samples were collected before, during, and after a treatment. Should only the problematic samples be excluded, or should all data from the affected subjects be removed from the DESeq2 analysis? Thanks.
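To lay the two options side by side, here is a small sketch in plain Python (not DESeq2 code) showing what each one leaves behind. The subjects, sample names, and QC flags are hypothetical. It only makes the trade-off visible and does not settle which option is right; that presumably depends on the design, for example whether subject is a term in the design formula.

```python
from collections import defaultdict

# Hypothetical annotations: (sample, subject, timepoint, failed_qc)
SAMPLES = [
    ("A_pre", "A", "before", False),
    ("A_mid", "A", "during", False),
    ("A_post", "A", "after", False),
    ("B_pre", "B", "before", False),
    ("B_mid", "B", "during", True),   # flagged as an outlier / low quality
    ("B_post", "B", "after", False),
]


def drop_flagged_samples(samples):
    """Option 1: remove only the samples that failed QC."""
    return [s for s in samples if not s[3]]


def drop_affected_subjects(samples):
    """Option 2: remove every sample from any subject with a failed sample."""
    bad_subjects = {subject for _, subject, _, failed in samples if failed}
    return [s for s in samples if s[1] not in bad_subjects]


def timepoints_per_subject(samples):
    """Summarise which time points remain for each subject."""
    table = defaultdict(list)
    for _, subject, timepoint, _ in samples:
        table[subject].append(timepoint)
    return dict(table)


option_1 = drop_flagged_samples(SAMPLES)
option_2 = drop_affected_subjects(SAMPLES)
print("samples only:", timepoints_per_subject(option_1))
print("whole subject:", timepoints_per_subject(option_2))
```

With the first option, subject B keeps its "before" and "after" samples; with the second, subject B disappears entirely.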