I would like to compare expression levels of orthologous genes in different species for the purpose of conducting a co-expression analysis such as WGCNA.
I am using DESeq2 with the following RNA-seq data:
- 6 different species (all from the same genus and phylogenetically related)
- 2 different conditions
What would be the ideal experimental design to generate comparable expression levels between the orthologous genes? And how do I test whether the normalized expression levels are indeed comparable between species?
What I did so far:
- Used OrthoMCL to identify single-copy orthologous genes
- Generated a matrix with raw counts of all single-copy orthologous genes for each of the 36 samples
- Normalized the counts using DESeq2, using the following design: ~ condition + species
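To make the normalization step concrete, here is a minimal sketch of median-of-ratios size factors, which is my understanding of what DESeq2's default size factor estimation does. It is plain Python rather than the DESeq2 code itself, the function name `size_factors` is hypothetical, and the counts are made-up toy values, not my data.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, not DESeq2 itself).

    counts: list of rows, one per gene, each row holding raw counts per sample.
    """
    # Keep only genes with non-zero counts in every sample, so logs are defined.
    rows = [r for r in counts if all(c > 0 for c in r)]
    if not rows:
        raise ValueError("no gene has non-zero counts in every sample")
    n_samples = len(rows[0])
    # Per-gene log geometric mean across samples (the pseudo-reference sample).
    log_ref = [sum(math.log(c) for c in r) / n_samples for r in rows]
    # Per sample: median over genes of log(count / reference), back-transformed.
    return [
        math.exp(median(math.log(r[j]) - ref for r, ref in zip(rows, log_ref)))
        for j in range(n_samples)
    ]

# Toy data (made up): sample 2 is sample 1 sequenced twice as deep.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
toy_sf = size_factors(toy_counts)
# Dividing each count by its sample's size factor gives the normalized counts.
normalized = [[c / s for c, s in zip(row, toy_sf)] for row in toy_counts]
```

Note that the size factors here are estimated only from the genes present in the matrix, so with an ortholog-only matrix they reflect the ortholog subset rather than the full library.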
I'm thinking that this might be an issue, since the model doesn't know that genome size and the number of genes vary considerably between the species, and it doesn't know the true sequencing depth because only a subset of the RNA-seq data (the single-copy orthologs) is being used.
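One thing I have considered is accounting for the fact that the same ortholog can have a different length in each species. Below is a rough sketch of a TPM-style rescaling using per-species ortholog lengths. The function name `length_scaled` and all numbers are hypothetical, the values sum to one million over the ortholog subset only (not the whole transcriptome), and I am not sure this is the right approach.

```python
def length_scaled(counts, lengths_bp):
    """TPM-style values computed over the genes in the matrix only.

    counts:     rows = orthologs, columns = samples (raw counts).
    lengths_bp: same shape; length of that ortholog in the species
                the sample comes from (assumed to be known).
    """
    n_genes, n_samples = len(counts), len(counts[0])
    # Reads per kilobase: removes the effect of species-specific gene length.
    rate = [
        [counts[g][s] / (lengths_bp[g][s] / 1000.0) for s in range(n_samples)]
        for g in range(n_genes)
    ]
    # Per-sample total, used to rescale each sample to one million.
    # Assumes every sample has at least one non-zero count.
    totals = [sum(rate[g][s] for g in range(n_genes)) for s in range(n_samples)]
    return [
        [rate[g][s] / totals[s] * 1e6 for s in range(n_samples)]
        for g in range(n_genes)
    ]

# Toy example (made up): two samples from two species. Ortholog A is twice
# as long in species 2 and collects twice the reads there.
toy_counts = [[100, 200], [100, 100]]
toy_lengths = [[1000, 2000], [1000, 1000]]
scaled = length_scaled(toy_counts, toy_lengths)
col_sums = [sum(row[s] for row in scaled) for s in range(2)]
```

In this toy case ortholog A ends up with the same value in both samples, because the extra reads in species 2 are explained entirely by its longer gene.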
Any suggestions on how to improve this?