I have the following data:
Samples Genotype Treatment Time
D2_WT_mock1 WT mock D2
D2_WT_mock2 WT mock D2
D2_WT_mock3 WT mock D2
D2_KO_mock1 KO mock D2
D2_KO_mock2 KO mock D2
D2_KO_mock3 KO mock D2
D2_WT_inf1 WT inf D2
D2_WT_inf2 WT inf D2
D2_WT_inf3 WT inf D2
D2_KO_inf1 KO inf D2
D2_KO_inf2 KO inf D2
D2_KO_inf3 KO inf D2
D6_WT_mock1 WT mock D6
D6_WT_mock2 WT mock D6
D6_WT_mock3 WT mock D6
D6_KO_mock1 KO mock D6
D6_KO_mock2 KO mock D6
D6_KO_mock3 KO mock D6
D6_WT_inf1 WT inf D6
D6_WT_inf2 WT inf D6
D6_WT_inf3 WT inf D6
D6_KO_inf1 KO inf D6
D6_KO_inf2 KO inf D6
D6_KO_inf3 KO inf D6
I have three different factors (genotype, treatment, and time) and would like to look at several comparisons for differentially expressed genes. I combined the factors for easier interpretation. However, when I look at the PCA plot (see attached), I see a lot of variation within D6.
My goal is to see differences between genotypes depending on treatment (using interaction terms), as well as the main effects, for both D2 and D6.
So my question is: in this multi-factor comparison, should I run DESeq2 on all samples and then remove the D6 ones with droplevels, or run DESeq2 on the D2 and D6 samples separately and interpret each on its own? Thanks.
If DESeq2 is run on everything together, I could probably also do comparisons between D2 and D6; if not, it is fine to analyse the two time points separately.
When I combine the factors to make interpretation easier, I end up with several levels to compare via pairwise comparisons.
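To make "several levels" concrete, here is a minimal sketch of the bookkeeping only, written in Python for illustration (the analysis itself would be run in R with DESeq2). The variable names are hypothetical; the level names simply follow the sample names in the table above with the replicate number dropped.

```python
from itertools import combinations, product

# Factor levels taken from the sample table above.
times = ["D2", "D6"]
genotypes = ["WT", "KO"]
treatments = ["mock", "inf"]

# One combined level per Time x Genotype x Treatment cell, e.g. "D2_WT_mock".
group_levels = ["_".join(cell) for cell in product(times, genotypes, treatments)]

# Every unordered pair of combined levels that could in principle be contrasted.
pairs = list(combinations(group_levels, 2))

print(len(group_levels))  # 8 combined levels
print(len(pairs))         # 28 possible pairwise comparisons
```

Only a handful of these pairs are biologically interesting, which is why the comparisons are listed explicitly below.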
Would the results be biased if I ran DESeq2 on the D2 samples only (dropping the D6 levels) and interpreted them in the following way?
For D2:
The effect of infection in wild type
The effect of infection in knockouts
Without infection, what is the difference between knockout and wild type?
With infection, what is the difference between knockout and wild type?
Similarly, for D6, would I do the same list of comparisons?
Yes, the comparisons you list are fine to perform in D2 and D6 separately.
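As a rough sketch of the four comparisons per time point, the Python below only spells out which combined level is the numerator and which is the denominator in each one; it does not run any differential expression. The function names and the combined-level naming are assumptions for illustration. In DESeq2 itself, each pair would typically be passed to `results()` as `contrast=c("group", numerator, denominator)`, assuming the combined factor column is called `group`. The genotype-by-treatment interaction mentioned in the question is shown as a difference of differences.

```python
def pairwise_contrasts(time):
    """Return the four listed comparisons as (numerator, denominator) pairs."""
    def lvl(genotype, treatment):
        # Hypothetical combined level name, e.g. "D2_WT_inf".
        return f"{time}_{genotype}_{treatment}"

    return {
        "infection effect in WT": (lvl("WT", "inf"), lvl("WT", "mock")),
        "infection effect in KO": (lvl("KO", "inf"), lvl("KO", "mock")),
        "KO vs WT without infection": (lvl("KO", "mock"), lvl("WT", "mock")),
        "KO vs WT with infection": (lvl("KO", "inf"), lvl("WT", "inf")),
    }


def interaction_weights(time):
    """Interaction as (KO_inf - KO_mock) - (WT_inf - WT_mock)."""
    return {
        f"{time}_KO_inf": 1,
        f"{time}_KO_mock": -1,
        f"{time}_WT_inf": -1,
        f"{time}_WT_mock": 1,
    }


for time in ("D2", "D6"):
    for name, (numerator, denominator) in pairwise_contrasts(time).items():
        print(f"{time}: {name}: {numerator} vs {denominator}")
    print(f"{time}: interaction weights: {interaction_weights(time)}")
```

The same list applies unchanged to D6, which is why running the two time points separately keeps the interpretation simple.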