The experimental design is as follows. There are two groups (NR and RE), each in triplicate. The triplicates were grown on a dish. At timepoint 0 (T0), samples were collected. Afterwards, they were either treated (RAP) or left untreated (Co), and at timepoint 6 (T6) samples were collected again. Here are the condition table and PCA plots (with and without sample correction) as an overview.
| Cond | Time | C |
|---|---|---|
| NR_Co | T0 | T0_NR_Co |
| NR_Co | T0 | T0_NR_Co |
| NR_Co | T0 | T0_NR_Co |
| RE_Co | T0 | T0_RE_Co |
| RE_Co | T0 | T0_RE_Co |
| RE_Co | T0 | T0_RE_Co |
| NR_Co | T6 | T6_NR_Co |
| NR_Co | T6 | T6_NR_Co |
| NR_Co | T6 | T6_NR_Co |
| RE_Co | T6 | T6_RE_Co |
| RE_Co | T6 | T6_RE_Co |
| RE_Co | T6 | T6_RE_Co |
| NR_RAP | T6 | T6_NR_RAP |
| NR_RAP | T6 | T6_NR_RAP |
| NR_RAP | T6 | T6_NR_RAP |
| RE_RAP | T6 | T6_RE_RAP |
| RE_RAP | T6 | T6_RE_RAP |
| RE_RAP | T6 | T6_RE_RAP |
https://ibb.co/eeqc2U
https://ibb.co/d0xDbp
https://ibb.co/bRCDbp
To answer the questions below, I decided to combine Time and Cond into a single grouping factor, because that seemed the most direct way to set up the comparisons:
dds$C <- factor(paste(dds$Time, dds$Cond, sep = "_"))
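As a minimal sketch in base R (no DESeq2 required), the grouping can be reproduced directly from the condition table. The data frame name `coldata` is an assumption standing in for the sample table of `dds`; the values are copied from the table in the post.

```r
# Rebuild the condition table from the post: 6 groups x 3 replicates = 18 samples.
coldata <- data.frame(
  Cond = rep(c("NR_Co", "RE_Co", "NR_Co", "RE_Co", "NR_RAP", "RE_RAP"), each = 3),
  Time = rep(c("T0", "T0", "T6", "T6", "T6", "T6"), each = 3)
)

# Combine Time and Cond into one grouping factor, e.g. "T0_NR_Co".
coldata$C <- factor(paste(coldata$Time, coldata$Cond, sep = "_"))

# Each of the six groups should contain exactly three samples.
print(table(coldata$C))
```

With a factor like this, each of the questions a) to d) becomes a pairwise comparison between two levels of `C`.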
a) Are there no differences between T6_NR_Co and T6_NR_RAP?
b) Are there no differences between T6_RE_Co and T6_RE_RAP?
c) Are there no differences between T6_NR_Co and T6_RE_Co?
d) Are there no differences between T6_NR_RAP and T6_RE_RAP?
The problem here is that I can’t control for the samples itself, because the design ~Id+C causes the error “Error in checkFullRank(modelMatrix) :”.
I can only use ~ C, which I think is not appropriate.
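The rank problem can be illustrated with base R alone. The sketch below is built on an assumption that is not stated in the table: replicate 1, 2, 3 of each group is the same dish/donor at T0, T6 Co, and T6 RAP, so `Id` has six levels (NR1 to NR3, RE1 to RE3). The column names `Type`, `DonorN`, and `Treat` are likewise made up for illustration. Under that assumption, both `Id` and `C` encode the NR/RE split, so their columns are linearly dependent. The second model matrix follows the "individuals nested within groups" pattern described in the DESeq2 vignette, which may or may not fit the real donor layout.

```r
coldata <- data.frame(
  Cond = rep(c("NR_Co", "RE_Co", "NR_Co", "RE_Co", "NR_RAP", "RE_RAP"), each = 3),
  Time = rep(c("T0", "T0", "T6", "T6", "T6", "T6"), each = 3)
)
coldata$C <- factor(paste(coldata$Time, coldata$Cond, sep = "_"))

# ASSUMPTION: replicate i within a group is the same donor at every sampling.
coldata$Type <- factor(sub("_.*", "", coldata$Cond))              # NR or RE
coldata$Id   <- factor(paste0(coldata$Type, rep(1:3, times = 6))) # NR1..NR3, RE1..RE3

# Design ~ Id + C: more coefficients than the matrix has rank.
mm <- model.matrix(~ Id + C, data = coldata)
cat("columns:", ncol(mm), " rank:", qr(mm)$rank, "\n")

# Possible alternative: number donors within each group and nest them in Type.
coldata$DonorN <- factor(rep(1:3, times = 6))                      # 1..3 within group
coldata$Treat  <- factor(paste(coldata$Time,
                               sub(".*_", "", coldata$Cond),
                               sep = "_"))                         # T0_Co, T6_Co, T6_RAP
mm2 <- model.matrix(~ Type + Type:DonorN + Type:Treat, data = coldata)
cat("columns:", ncol(mm2), " rank:", qr(mm2)$rank, "\n")
```

If the real donor assignment differs from the assumed one, the same `qr(...)$rank` versus `ncol(...)` check can be applied to the actual sample table before handing the design to DESeq2.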
Therefore, I was wondering whether it would be best to split the dataset into RE samples and NR samples. That would make it possible to answer a) and b), and I would use the whole dataset for c) and d).
Just to confirm: if I want to look at the effect of the treatment on the two groups, I would need to use the interaction design Type + Treatment + Type:Treatment, correct?
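To make the interaction design concrete, here is a small base R sketch for the twelve T6 samples only. The column names `Type` and `Treatment` are taken from the formula above; the data frame `t6` and the choice of NR and Co as reference levels (R's default alphabetical ordering) are assumptions.

```r
# T6 samples only: 2 types x 2 treatments x 3 replicates.
t6 <- data.frame(
  Type      = factor(rep(c("NR", "RE", "NR", "RE"), each = 3)),
  Treatment = factor(rep(c("Co", "Co", "RAP", "RAP"), each = 3))
)

mm <- model.matrix(~ Type + Treatment + Type:Treatment, data = t6)
print(colnames(mm))

# Reading the coefficients (reference levels: Type = NR, Treatment = Co):
#   TypeRE              : RE vs NR in untreated (Co) samples
#   TreatmentRAP        : RAP vs Co within NR
#   TypeRE:TreatmentRAP : how much the RAP effect in RE differs from that in NR
```

In this parameterisation the interaction term is the one that addresses whether the treatment effect differs between the two groups, while `TreatmentRAP` alone is the treatment effect in the reference group.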
My last question concerns the following.
If I do the comparisons
e) T6_NR_Co vs T0_NR_Co
f) T6_NR_RAP vs T0_NR_Co
I get about 1,500 DEGs for e) and about 2,000 DEGs for f). Overlapping the two lists tells me that 50% are DE in both e) and f), 15% are DE only in e), and the rest (35%) only in f). However, a comparison of T6_NR_RAP vs T6_NR_Co gives me 0 DEGs, which would mean there is no difference between RAP and Co at T6, even though e) and f) both show DEGs. I should also mention that for T6_NR_RAP vs T6_NR_Co the p-value histogram rises towards 1 and the padj values are all identical and close to 1.
What would be the best design and contrast to test for differences that come only from RAP over time? Could I use only the 35% of genes that are DE only in f), even though these are not DEGs for T6_NR_RAP vs T6_NR_Co?
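One way to see how e), f), and the direct comparison relate is to write each as a contrast over group means. The sketch below assumes a design like ~ C, where every comparison is a difference of two group coefficients; the helper `contrast_vec` is hypothetical and only does the bookkeeping.

```r
groups <- c("T0_NR_Co", "T6_NR_Co", "T6_NR_RAP")

# Hypothetical helper: +1 for the numerator group, -1 for the denominator group.
contrast_vec <- function(num, den) {
  v <- setNames(numeric(length(groups)), groups)
  v[num] <- 1
  v[den] <- -1
  v
}

e      <- contrast_vec("T6_NR_Co",  "T0_NR_Co")   # comparison e)
f      <- contrast_vec("T6_NR_RAP", "T0_NR_Co")   # comparison f)
direct <- contrast_vec("T6_NR_RAP", "T6_NR_Co")   # RAP vs Co at T6

# The shared T0 baseline cancels, so f) minus e) is the direct comparison.
print(f - e)
print(isTRUE(all.equal(f - e, direct)))
```

Because the shared T0 baseline cancels, "the change over time that is specific to RAP" is estimated by the same quantity as T6_NR_RAP vs T6_NR_Co. Genes that pass the cutoff in f) but not in e) can therefore differ between the lists because of thresholding, without the direct contrast being significant.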
Thanks
Mathias
Can you explain what you mean by "control for the samples itself"? You mean controlling for donor as listed above?
Yes. Sorry, I meant donor and not samples. The comparison T6_RE_Co vs T6_NR_RAP should need the design Donor + C, because the same donors are in both treatments.