Hello all,
We have RNA-seq data with two variables: tissue (tissue1 and tissue2) and treatment (healthy and unhealthy). There are 4 healthy and 4 unhealthy animals in total, and samples of both tissues were collected from each animal. We would like to perform a differential gene expression analysis between healthy and unhealthy animals that accounts for the tissue effect (both tissues) in the same analysis.

In 2015, we used edgeR (version 3.2.2) to find DE genes between tissue 1 and tissue 2 only. At that time, edgeR was not able to perform the DE analysis between healthy and unhealthy animals while accounting for both tissues, so we used DESeq2 for this purpose. We are now preparing to publish these data, and it would be better if both analyses (tissue1 vs. tissue2, and healthy vs. unhealthy accounting for the tissue effect) were performed with the same Bioconductor package.

My question is: does anyone know whether the current (2018) version of edgeR (3.5) can now fit a multi-factor model with interaction in the same way as DESeq2? I am a beginner with this software. I have read the manuals, but found nothing about improvements or changes to the edgeR models during this period.
Thank you all.
Thank you very much for your prompt reply, Michael.
I would like to ask you a couple more questions:
- Has edgeR been able to do this analysis since 2015?
- Based on your previous answer, it seems I will have to redo the analysis. I am studying the edgeR guide, and I guess the new analysis is something similar to section 4.4, "RNA-Seq profiles of mouse mammary gland". Am I on the right track? If not, I would very much appreciate it if you could suggest some material on the proper analysis.
Thank you for your time and attention.
Camilla
I'll chip in here - yes, edgeR was able to do this in 2015. The capability was there when I started working on edgeR in earnest (2013, maybe?), at which time I migrated all of the existing R code to use C++.
As for the specific analysis, I also suspect that you have not been performing your analysis correctly. This is not a simple multi-factor model, as the tissues come from the same animal. A full model would probably look like:
... for which you cannot compare directly between healthy and unhealthy. You can only test for tissue differences within each status level, or compare "differences of differences" between tissue and status, i.e., differences in tissue-to-tissue differences between status levels.
If you want to compare directly between, e.g., healthy tissue 1 and unhealthy tissue 1, you need to subset your data so that each animal only contributes one sample. This ensures that there are not any correlations between the samples. Such correlations cannot be modelled by adding blocking factors, as then you would not be able to perform the desired contrasts.
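To see why a blocking factor rules out the direct healthy-versus-unhealthy contrast, here is a minimal sketch in plain Python (standard library only). It is not edgeR code. The sample layout (animals 1 to 4 healthy, 5 to 8 unhealthy), the variable names and the `matrix_rank` helper are all assumptions made for illustration. The second design is only one possible parameterization of a per-animal model with a tissue effect inside each status level, and is not necessarily the model the post above had in mind.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical layout: animals 1-4 healthy, 5-8 unhealthy, two tissues each.
samples = [
    (animal, "unhealthy" if animal > 4 else "healthy", tissue)
    for animal in range(1, 9)
    for tissue in ("tissue1", "tissue2")
]


def animal_cols(animal):
    """One indicator column per animal (the blocking factor)."""
    return [1 if animal == a else 0 for a in range(1, 9)]


# Design A: animal blocking plus a main effect of status.
design_blocked_status = [
    animal_cols(a) + [1 if s == "unhealthy" else 0] for a, s, t in samples
]

# Design B: animal blocking plus a tissue2 effect within each status level.
design_nested_tissue = [
    animal_cols(a)
    + [
        1 if (s == "healthy" and t == "tissue2") else 0,
        1 if (s == "unhealthy" and t == "tissue2") else 0,
    ]
    for a, s, t in samples
]

rank_a = matrix_rank(design_blocked_status)
rank_b = matrix_rank(design_nested_tissue)

# Contrast on design B: unhealthy tissue effect minus healthy tissue effect,
# i.e. a "difference of differences".
diff_of_diff = [0] * 8 + [-1, 1]

print(rank_a, "of", len(design_blocked_status[0]), "columns")  # 8 of 9 columns
print(rank_b, "of", len(design_nested_tissue[0]), "columns")   # 10 of 10 columns
```

Design A has 9 columns but rank 8, because the status column is exactly the sum of the indicator columns for the unhealthy animals, so a status coefficient cannot be estimated once animal is in the model. Design B is full rank, but its extra coefficients are tissue effects within each status, which is why only within-status tissue tests and the difference-of-differences contrast are available.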
Alternatively, you could use limma and voom with duplicateCorrelation, which avoids the need to discard any samples. Of course, all this assumes that a per-animal effect exists; if it doesn't, you can just use a multi-factor model with status and tissue, which can be accommodated by all popular DE analysis methods.