This is a follow-up to a previous question, which can be found here: C: Adjusting for procedural effect and deciding on statistical test.
I am trying to analyze an RNA-seq dataset with edgeR. I have 44 samples, coming from two controls (both the full organ, namely a plant root) and five different cell types. I have 6 samples per sample type, except for one of the controls, for which I have 8 samples. Within each sample type, half of the samples are treated and the other half are not. We are interested in the effect of this treatment.
To get the individual cell types, we had to sort our protoplasts, so all the cell type samples are sorted. We also sorted one of the controls (the one with 8 samples), while we did not sort the other control. We hope to use these two controls to correct for the sorting effect when analyzing our data for treatment-induced effects with edgeR. To summarize, this is our experimental design:
tissue <- c(rep("control_wholeRoot",6),rep("control_sorted", 8), rep(c("type1", "type2", "type3", "type4", "type5"), each=6))
group <- factor(paste0(tissue, ".", sorting, ".", treatment))
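The `sorting` and `treatment` vectors are not shown above. A minimal sketch of how they could be built for this layout follows; the sample order within each sample type (first half untreated, second half treated) and the labels "sorted"/"unsorted" and "treated"/"untreated" are assumptions for illustration, not necessarily what our script uses.

```r
# Sample layout from the post: 6 unsorted whole-root controls, 8 sorted controls,
# and 6 samples for each of the five sorted cell types (44 samples in total).
n_per_type <- c(control_wholeRoot = 6, control_sorted = 8,
                type1 = 6, type2 = 6, type3 = 6, type4 = 6, type5 = 6)
tissue <- rep(names(n_per_type), times = n_per_type)

# Only the whole-root control was left unsorted.
sorting <- ifelse(tissue == "control_wholeRoot", "unsorted", "sorted")

# ASSUMPTION: within each sample type, the first half of the samples is
# untreated and the second half is treated. The real order may differ.
treatment <- unlist(
  lapply(n_per_type, function(n) rep(c("untreated", "treated"), each = n / 2)),
  use.names = FALSE
)

group <- factor(paste0(tissue, ".", sorting, ".", treatment))

# One column per group, no intercept, so each coefficient is a group mean.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

table(group)   # number of samples in each group
dim(design)    # samples x groups
```

Tabulating `group` is a quick way to confirm that every sample type has both a treated and an untreated group before fitting.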
This design is used to construct the fit object from the digital gene expression list (DGEList) that I made from my counts table.
#make the DGEList
#filter out lowly expressed genes
#fit the data
I then make contrasts to compare the treated vs the non-treated samples within each sample type (each of the two controls and each of the 5 cell types). This gives 7 lists, one per sample type, each containing all genes with their fold change (FC) due to the treatment in that sample type.
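To make the seven comparisons concrete, here is a sketch of the contrast matrix written out by hand in base R. The group labels are hypothetical and follow the `tissue.sorting.treatment` naming pattern; the actual level names in our script may differ.

```r
# The seven sample types: two controls and five cell types.
# ASSUMPTION: labels follow the pattern tissue.sorting.treatment.
sample_types <- c("control_wholeRoot.unsorted", "control_sorted.sorted",
                  paste0("type", 1:5, ".sorted"))

# Group names: a treated and an untreated group for every sample type.
lev <- as.vector(rbind(paste0(sample_types, ".treated"),
                       paste0(sample_types, ".untreated")))

# One column per comparison: +1 for the treated group, -1 for the untreated one.
contrasts <- matrix(0, nrow = length(lev), ncol = length(sample_types),
                    dimnames = list(lev, sample_types))
for (s in sample_types) {
  contrasts[paste0(s, ".treated"), s] <- 1
  contrasts[paste0(s, ".untreated"), s] <- -1
}

contrasts
```

Each column could then be passed as the `contrast` argument of an edgeR test function such as `glmQLFTest()`, provided the rows are first reordered to match the column order of the design matrix.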
We want to check whether including 'sorting' in our design matrix actually does anything. To check this, we reran the script without 'sorting' in 'group'. For each of the two resulting datasets, we then made a scatterplot of the FC of each gene in the unsorted control against the FC of the same gene in the sorted control.
We expected the scatterplot made from the dataset with 'sorting' in 'group' to show points that fall closer to the identity line (and to give a larger R squared) than the scatterplot made from the dataset without 'sorting' in 'group'.
However, we got two graphs that are exactly the same. Are we missing something here and should we use a different check, or is it indeed a problem that the two graphs are identical?
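One check that may help narrow this down is to compare the two `group` factors directly, with and without 'sorting'. The sketch below rebuilds the sample layout described above; the within-type sample order and the treatment and sorting labels are assumptions. It tests whether both factors split the 44 samples into the same groups.

```r
# Sample layout as described in the post.
n_per_type <- c(control_wholeRoot = 6, control_sorted = 8,
                type1 = 6, type2 = 6, type3 = 6, type4 = 6, type5 = 6)
tissue <- rep(names(n_per_type), times = n_per_type)
sorting <- ifelse(tissue == "control_wholeRoot", "unsorted", "sorted")

# ASSUMPTION: first half of each sample type untreated, second half treated.
treatment <- unlist(
  lapply(n_per_type, function(n) rep(c("untreated", "treated"), each = n / 2)),
  use.names = FALSE
)

group_with    <- factor(paste0(tissue, ".", sorting, ".", treatment))
group_without <- factor(paste0(tissue, ".", treatment))

# Cross-tabulate the two factors. If every row and every column has exactly
# one non-empty cell, both factors define the same grouping of the samples
# and differ only in their labels.
xt <- table(group_with, group_without)
same_partition <- all(rowSums(xt > 0) == 1) && all(colSums(xt > 0) == 1)

nlevels(group_with)
nlevels(group_without)
same_partition
```

Because 'sorting' is fully determined by 'tissue' in this layout, `same_partition` comes out `TRUE` regardless of the sample order. In that case the two design matrices describe the same model, and identical fold changes would be expected rather than being a sign that something went wrong.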
Any help would be highly appreciated,
Eline Verbon and Ronnie de Jonge