Hello, I've conducted a DEG analysis with DESeq2 and I just want to make sure that I'm doing it right. I have the following information for the design:
info:
region: region1, region1, region2
species: species1, species2, species1
group: region1_species1, region1_species2, region2_species1
(unfortunately, region2_species2 is missing!)
--> So I have both species from region1, but only species1 from region2, and I want to test for the effects of species and region. Each region/species combination has 5 replicates.
Is it ok to use the combined factor (group) to test for my comparisons of interest:
--> differences in species: region1_species1 vs region1_species2
--> differences in region: region1_species1 vs region2_species1
dds_all <- DESeqDataSetFromMatrix(countData = matrix,
                                  colData = info,
                                  design = ~ group)
Or is it better to use a nested design (something like: ~species+region)?
What I can see so far is a huge effect of species, so it makes sense to separate them (e.g. to test for region within the respective species).
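One way to compare the two candidate designs is to build their model matrices in base R and check whether each is full rank. The following is only a sketch: the sample table is a hypothetical stand-in that mirrors the layout described above (three observed region/species combinations, 5 replicates each), and `is_full_rank` and the `mm_*` names are made up for illustration.

```r
# Hypothetical sample table mirroring the layout described above:
# 5 replicates for each of the three observed region/species combinations.
info <- data.frame(
  region  = factor(rep(c("region1", "region1", "region2"), each = 5)),
  species = factor(rep(c("species1", "species2", "species1"), each = 5))
)
info$group <- factor(paste(info$region, info$species, sep = "_"))

# Build the design matrix for each candidate formula.
mm_group    <- model.matrix(~ group, data = info)
mm_additive <- model.matrix(~ species + region, data = info)
mm_interact <- model.matrix(~ species * region, data = info)

# Hypothetical helper: a design is estimable as written only if the
# rank of its model matrix equals its number of columns.
is_full_rank <- function(mm) qr(mm)$rank == ncol(mm)

is_full_rank(mm_group)
is_full_rank(mm_additive)
is_full_rank(mm_interact)

# Compare the two three-column matrices entry by entry.
all(mm_group == mm_additive)
```

Under these assumptions, `~ group` and `~ species + region` both yield a full-rank, three-column matrix, and the two matrices hold the same values, so the two formulas are reparameterizations of the same fit. Only the interaction design `~ species * region` loses rank, because the column for the missing region2_species2 combination is all zeros. With `~ group`, the two comparisons of interest could then be extracted with `results(dds_all, contrast = c("group", "region1_species2", "region1_species1"))` and `results(dds_all, contrast = c("group", "region2_species1", "region1_species1"))`.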
Hi Michael, I totally understand. I'm asking since both options (combining the factors into one & a nested design) seem to be OK from a statistical point of view. But it would be nice to get an opinion about a design that does not contain "equal" factor levels (region2_species2 is missing); I was hoping to get support here for such "special cases".
I just really don't have any extra time to consult in this way on the support site. Sorry!