edgeR design matrix
@cafelumiere12-7513
United States

I have the following samples for differential expression analysis and would like to check whether my design matrix makes sense. There are cell samples from three different donors, each of which went through 2 different cell culturing processes and 5 different treatments. The goal is to look at the differences between treatments and also between processes. Samples that went through process A have data for all 5 treatments, while samples that went through process B have data for only 2 of the 5 treatments. Is the design matrix below constructed correctly? Thanks a lot!

library(readr)  # provides read_csv()
sampleInfo <- read_csv(<samplemanifest_csvfile>, col_names=TRUE)
Donor <- factor(sampleInfo$Donor)
Treatment <- factor(sampleInfo$Treatment)
Process <- factor(sampleInfo$Process)
design <- model.matrix(~0+Treatment+Process+Donor)

Donor Process Treatment
P01 A 1
P01 A 2
P01 A 3
P01 A 4
P01 A 5
P02 A 1
P02 A 2
P02 A 3
P02 A 4
P02 A 5
P03 A 1
P03 A 2
P03 A 3
P03 A 4
P03 A 5
P01 B 2
P01 B 5
P02 B 2
P02 B 5
P03 B 2
P03 B 5
@gordon-smyth
WEHI, Melbourne, Australia

Well, you are asking a biological question rather than a computing question.

Personally, I think it is unlikely that Process and Treatment have additive effects. It would be more usual to assume that they might interact. The most usual limma analysis for this type of experiment would allow general interactions between Treatment and Process:

ProcTreat <- paste(sampleInfo$Process, sampleInfo$Treatment, sep=".")
design <- model.matrix(~0+ProcTreat+Donor)
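
To see what this design looks like for the 21 samples listed in the question, here is a minimal sketch in base R. The sampleInfo data frame is rebuilt by hand from the posted table (an assumption standing in for the CSV manifest), and the renaming of the columns with sub() is only for readability, not something the analysis requires.

```r
# Sample manifest rebuilt by hand from the table in the question
# (assumption: this stands in for the CSV file read by the original poster).
sampleInfo <- data.frame(
  Donor     = c(rep(c("P01", "P02", "P03"), each = 5),
                rep(c("P01", "P02", "P03"), each = 2)),
  Process   = c(rep("A", 15), rep("B", 6)),
  Treatment = c(rep(1:5, times = 3), rep(c(2, 5), times = 3))
)

Donor <- factor(sampleInfo$Donor)

# One combined factor with a level for every Process/Treatment pair observed.
ProcTreat <- factor(paste(sampleInfo$Process, sampleInfo$Treatment, sep = "."))

# One column per observed Process.Treatment group, plus donor effects
# relative to the first donor.
design <- model.matrix(~0 + ProcTreat + Donor)

# Shorten the column names (cosmetic only).
colnames(design) <- sub("^ProcTreat", "", colnames(design))

print(colnames(design))
print(dim(design))

# Check that every coefficient is estimable (rank equals number of columns).
print(qr(design)$rank == ncol(design))
```

Only the seven Process/Treatment combinations that were actually run get a column, so the missing Process B treatments never enter the model.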

Note that I have given you almost exactly the same advice before for a slightly different experiment; see "edgeR design matrix and contrasts: how to make contrast between groups that aren't shown in the design matrix columns?"


Thank you very much! Yes, I was actually reading your previous answer earlier and thought about using what you suggest here (similar to before as well). The only thing is that the scientist also wants to look at differences "between processes". So I thought I should construct the design matrix so that I can form a contrast that directly compares Process A with Process B, which led me to model.matrix(~0+Treatment+Process+Donor).

- Does this mean that with the contrast (Process A - Process B) I am not separating treatments, but looking at all the treatments together?

- If I use model.matrix(~0+ProcTreat+Donor), following on from the question above, would you say it is more correct to look at differences between processes within the same treatment?

On a side note, I see that most of the variability here actually comes from the different donors.

Thanks very much again.


Yes, it is generally more meaningful to compare Processes for the same Treatment.
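
As a rough illustration of comparing Processes within the same Treatment, the sketch below builds contrast vectors for the ProcTreat design in base R. The manifest is rebuilt by hand from the table in the question, and make_contrast() is a hypothetical helper written for this example, not a function from edgeR or limma. With edgeR, a vector like these could be supplied as the contrast argument of glmQLFTest() or glmLRT(); limma's makeContrasts() is another way to build the same vectors.

```r
# Sample manifest rebuilt by hand from the table in the question (assumption).
sampleInfo <- data.frame(
  Donor     = c(rep(c("P01", "P02", "P03"), each = 5),
                rep(c("P01", "P02", "P03"), each = 2)),
  Process   = c(rep("A", 15), rep("B", 6)),
  Treatment = c(rep(1:5, times = 3), rep(c(2, 5), times = 3))
)
Donor <- factor(sampleInfo$Donor)
ProcTreat <- factor(paste(sampleInfo$Process, sampleInfo$Treatment, sep = "."))
design <- model.matrix(~0 + ProcTreat + Donor)
colnames(design) <- sub("^ProcTreat", "", colnames(design))

# Hypothetical helper: average of the `plus` columns minus the average of the
# `minus` columns, with zeros for every other coefficient.
make_contrast <- function(design, plus, minus) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[plus]  <-  1 / length(plus)
  v[minus] <- -1 / length(minus)
  v
}

# Process B vs Process A, separately within each treatment that has both.
BvsA.T2 <- make_contrast(design, "B.2", "A.2")
BvsA.T5 <- make_contrast(design, "B.5", "A.5")

# Process B vs Process A, averaged over the two shared treatments.
BvsA.avg <- make_contrast(design, c("B.2", "B.5"), c("A.2", "A.5"))

print(rbind(BvsA.T2, BvsA.T5, BvsA.avg))
```

Because only Treatments 2 and 5 were run under both processes, these are the only treatments for which a Process comparison can be formed.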

Comparing Process B to Process A using your old model was confounding differences between Processes with differences between Treatments, because Treatments 1, 3 and 4 were not used with Process B. There were other problems as well.

There is no such thing as "separating treatments". You can't compare processes as if treatments didn't exist.
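
The imbalance behind the confounding described above can be seen by cross-tabulating the posted sample table. This is a small base R sketch; the two vectors are typed in by hand from the table in the question rather than read from the original manifest.

```r
# Process and Treatment labels copied by hand from the table in the question.
Process   <- c(rep("A", 15), rep("B", 6))
Treatment <- c(rep(1:5, times = 3), rep(c(2, 5), times = 3))

# Number of samples in each Process/Treatment cell.
counts <- table(Process, Treatment)
print(counts)

# Treatments with no Process B samples at all.
missing_in_B <- colnames(counts)[counts["B", ] == 0]
print(missing_in_B)
```

Any overall A versus B comparison that pools all samples would mix in the treatments that exist only under Process A.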


Thank you very much for your help!
