Hi,
I was wondering if someone could provide some advice on the best way to approach my differential expression problem, or if it is even possible to do so.
We generated RNA-seq data from a series of gene knockdowns in an insect cell line:
| Sample | Rep | Library | Batch |
|---|---|---|---|
| Ctrl | 1 | Ctrl_1 | A |
| Ctrl | 2 | Ctrl_2 | A |
| Target_A | 1 | Target_A_1 | A |
| Target_A | 2 | Target_A_2 | A |
| Target_B | 1 | Target_B_1 | B |
| Target_B | 2 | Target_B_2 | B |
| Target_C | 1 | Target_C_1 | B |
| Target_C | 2 | Target_C_2 | B |
| Target_D | 1 | Target_D_1 | B |
| Target_D | 2 | Target_D_2 | B |
| Target_E | 1 | Target_E_1 | B |
| Target_E | 2 | Target_E_2 | B |
I would like to find the differentially expressed genes for each Target vs Ctrl. Specifically, we want to know how similar the DE genes of Targets B-E are to those of Target A. (We suspect A interacts with B-E and would like to see how loss of each of these factors compares with loss of A.)
However, the two sets of experiments (batches A and B) were performed several weeks apart, and PCA and sample distance analysis show that the batch B samples are quite different from both the Ctrl and Target_A samples.
I attempted to use DESeq2 with a design incorporating Batch, but I get an error about the design not being full rank. I am not very familiar with how to specify designs.
Do I have any options besides design = ~ Sample when setting up my DESeq object to account for batch effects here? Or are there potentially other options to consider?
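In case it helps with diagnosing the error, here is a minimal sketch of what I think is going on. It is plain Python rather than DESeq2 code, and it assumes a design along the lines of ~ Batch + Sample with treatment coding, taking Ctrl and batch A as the reference levels. The helper names `design_matrix` and `rank` are made up for this illustration. It builds the dummy-coded matrix from the sample table above and compares its rank with its number of columns.

```python
from fractions import Fraction

# (Sample, Batch) for each library, copied from the table in this post.
samples = [
    ("Ctrl", "A"), ("Ctrl", "A"),
    ("Target_A", "A"), ("Target_A", "A"),
    ("Target_B", "B"), ("Target_B", "B"),
    ("Target_C", "B"), ("Target_C", "B"),
    ("Target_D", "B"), ("Target_D", "B"),
    ("Target_E", "B"), ("Target_E", "B"),
]


def design_matrix(rows):
    """Dummy-coded matrix for an assumed ~ Batch + Sample design.

    Reference levels (an assumption): Ctrl for Sample, A for Batch.
    """
    targets = sorted({cond for cond, _ in rows if cond != "Ctrl"})
    columns = ["Intercept", "Batch_B"] + targets
    matrix = []
    for cond, batch in rows:
        row = [1, int(batch == "B")] + [int(cond == t) for t in targets]
        matrix.append([Fraction(v) for v in row])
    return columns, matrix


def rank(matrix):
    """Matrix rank by Gaussian elimination with exact fractions."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate this column from every other row.
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


columns, X = design_matrix(samples)
n_columns = len(columns)
design_rank = rank(X)

# Is the batch column just the sum of the Target_B..Target_E columns?
idx = {name: i for i, name in enumerate(columns)}
batch_is_sum = all(
    row[idx["Batch_B"]]
    == sum(row[idx[t]] for t in ("Target_B", "Target_C", "Target_D", "Target_E"))
    for row in X
)

print(columns)
print("columns:", n_columns, "rank:", design_rank)  # columns: 7 rank: 6
print("Batch_B equals Target_B + ... + Target_E:", batch_is_sum)  # True
```

Under those assumptions the matrix has 7 columns but rank 6, because the Batch_B column is exactly the sum of the Target_B to Target_E columns: every batch B library is one of those knockdowns and there is no Ctrl in batch B. That seems to be why a separate batch term cannot be estimated here.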
Thanks,
Michael Chambers