In one of the RNA-Seq datasets I'm analysing, the knockdown or overexpression of a single gene is compared to the empty vector control. This comparison has been done for seven different genes, and in every case I get more than 5000 significantly differentially expressed genes. That number seems excessive given that only one gene has been knocked down or overexpressed in each comparison, and since it happens for all seven genes, I suspect I'm doing something wrong. Does anyone have advice on how I can check whether the results are plausible, or how to pinpoint where I went wrong?
The pipeline trims the reads with fastp, aligns them with STAR (I've also tried Salmon with the same result) and tests for differential expression with DESeq2.
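
As a sanity check on the headline number, it may help to see how many of the significant genes also pass an effect-size cutoff. The following is a minimal Python sketch under stated assumptions, not the code from the pipeline above. It assumes the DESeq2 results table has been exported to CSV with its standard `log2FoldChange` and `padj` columns. The function name `count_significant`, the thresholds and the toy data are hypothetical.

```python
import csv
import io
import math


def count_significant(rows, alpha=0.05, min_abs_lfc=0.0):
    """Count genes with padj < alpha and |log2FoldChange| >= min_abs_lfc.

    Rows whose values are missing or not numeric (for example "NA",
    which DESeq2 writes for filtered genes) are skipped.
    """
    n = 0
    for row in rows:
        try:
            padj = float(row["padj"])
            lfc = float(row["log2FoldChange"])
        except (KeyError, ValueError):
            continue
        if math.isnan(padj) or math.isnan(lfc):
            continue
        if padj < alpha and abs(lfc) >= min_abs_lfc:
            n += 1
    return n


# Toy stand-in for an exported results table (made-up values, assumption only).
EXAMPLE_CSV = """gene,log2FoldChange,padj
geneA,2.5,0.0001
geneB,0.1,0.01
geneC,-1.2,0.03
geneD,0.05,0.2
geneE,NA,NA
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
n_all = count_significant(rows)                     # significance only
n_large = count_significant(rows, min_abs_lfc=1.0)  # plus |log2FC| >= 1
print(n_all, n_large)
```

If most of the significant genes drop out once a fold-change cutoff is applied, the large count may mainly reflect many small but consistent shifts rather than thousands of strong changes. That is only one possible reading and does not rule out a batch or design problem.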