Question

edgeR design matrix

cafelumiere12 ▴ 20

@cafelumiere12-7513

United States

I have the following samples for a differential expression analysis and would like to check whether my design matrix makes sense. There are cell samples from three different donors, each of which has gone through 2 different cell culturing processes and 5 different treatments. The goal is to look at the differences between treatments and also between processes. Samples that have gone through process A have data for all 5 treatments, while samples that have gone through process B have data for only 2 of the 5 treatments. Is the design matrix here constructed correctly? Thanks a lot!

library(readr)  # provides read_csv()
sampleInfo <- read_csv(<samplemanifest_csvfile>, col_names=TRUE)
Donor <- factor(sampleInfo$Donor)
Treatment <- factor(sampleInfo$Treatment)
Process <- factor(sampleInfo$Process)
design <- model.matrix(~0+Treatment+Process+Donor)

Donors	Process	Treatment
P01	A	1
P01	A	2
P01	A	3
P01	A	4
P01	A	5
P02	A	1
P02	A	2
P02	A	3
P02	A	4
P02	A	5
P03	A	1
P03	A	2
P03	A	3
P03	A	4
P03	A	5
P01	B	2
P01	B	5
P02	B	2
P02	B	5
P03	B	2
P03	B	5

score 0 · Answer 1 · 2018-03-17

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

Well, you are asking a biological question rather than a computing question.

Personally, I think it is unlikely that Process and Treatment have additive effects. It would be more usual to assume that they might interact. The most usual limma analysis for this type of experiment would allow general interactions between Treatment and Process:

ProcTreat <- paste(sampleInfo$Process, sampleInfo$Treatment, sep=".")
design <- model.matrix(~0+ProcTreat+Donor)

Note that I have given you almost exactly the same advice before for a slightly different experiment, see: "edgeR design matrix and contrasts: how to make contrast between groups that aren't shown in the design matrix columns?"
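
For readers following along, here is a minimal sketch of what the combined-factor design looks like for this experiment. It assumes the sample table from the question is rebuilt by hand as `sampleInfo` (the real data would come from the CSV manifest), and stripping the `ProcTreat` prefix from the column names is only a convenience that is not part of the original answer.

```r
# Rebuild the 21-sample layout from the question (assumed, not read from file).
donors <- c("P01", "P02", "P03")
sampleInfo <- rbind(
  expand.grid(Treatment = 1:5, Donor = donors, Process = "A",
              stringsAsFactors = FALSE),
  expand.grid(Treatment = c(2L, 5L), Donor = donors, Process = "B",
              stringsAsFactors = FALSE)
)

# One combined factor with a level for every observed Process/Treatment pair.
Donor <- factor(sampleInfo$Donor)
ProcTreat <- factor(paste(sampleInfo$Process, sampleInfo$Treatment, sep = "."))

# Group-means parametrization plus an additive donor (blocking) effect.
design <- model.matrix(~0 + ProcTreat + Donor)

# Optional: shorten the group column names for easier contrast writing.
colnames(design) <- sub("^ProcTreat", "", colnames(design))

colnames(design)
# "A.1" "A.2" "A.3" "A.4" "A.5" "B.2" "B.5" "DonorP02" "DonorP03"

# The design should have one independent column per coefficient.
qr(design)$rank == ncol(design)
```

Only the seven Process/Treatment combinations that were actually observed get a column, so the missing Process B treatments never enter the model.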

Gordon Smyth 50k · 6.1 years ago

Thank you very much! Yes, I was actually reading your previous answer earlier and thought about using what you suggest here (similar to before as well). The only thing, though, is that the scientist also wanted to look at differences "between processes". So I thought maybe I should build the design matrix so that I could form a contrast that directly compares Process A with Process B, hence the design matrix model.matrix(~0+Treatment+Process+Donor).

- Does this mean that, with this design, the contrast (Process A - Process B) does not separate the treatments and instead looks at all the treatments together?

- If I use model.matrix(~0+ProcTreat+Donor), following on from the question above, would you think it is more correct to look at differences between processes within the same treatment?

On a side note, I see that most of the variability here actually comes from the different donors.

Thanks very much again.

cafelumiere12 ▴ 20 · 6.1 years ago

Yes, it is generally more meaningful to compare Processes for the same Treatment.
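
As an illustration of comparing processes within the same treatment, the sketch below writes the two possible contrasts (Process B vs Process A at Treatment 2 and at Treatment 5) as plain numeric vectors. The coefficient names are assumed to be the columns of model.matrix(~0+ProcTreat+Donor) with the `ProcTreat` prefix stripped, and `make_contrast` is a hypothetical helper written for this example. In practice, makeContrasts() from limma builds the same matrix from expressions such as B.2 - A.2, and the result can be passed to the `contrast` argument of edgeR's glmQLFTest().

```r
# Assumed coefficient names of the ProcTreat + Donor design.
coefs <- c("A.1", "A.2", "A.3", "A.4", "A.5", "B.2", "B.5",
           "DonorP02", "DonorP03")

# Hypothetical helper: +1 on one coefficient, -1 on another, 0 elsewhere.
make_contrast <- function(plus, minus, coefs) {
  v <- setNames(numeric(length(coefs)), coefs)
  v[plus] <- 1
  v[minus] <- -1
  v
}

# Process B vs Process A, each within a single treatment.
BvsA.T2 <- make_contrast("B.2", "A.2", coefs)
BvsA.T5 <- make_contrast("B.5", "A.5", coefs)

# One column per contrast, one row per model coefficient.
contrasts <- cbind(BvsA.T2, BvsA.T5)
contrasts
```

Treatments 2 and 5 are the only ones for which such a contrast can be written, because they are the only treatments observed under both processes.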

Comparing Process B to Process A using your old model was confounding differences between Processes with differences between Treatments, because Treatments 1, 3 and 4 were not used with Process B. There were other problems as well.

There is no such thing as "separating treatments". You can't compare processes as if treatments didn't exist.
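
The imbalance behind this confounding can be seen by cross-tabulating Process against Treatment. This is a small sketch that assumes the layout from the sample table in the question, typed in by hand rather than read from the manifest.

```r
# Process and Treatment labels for the 21 samples in the question.
process   <- c(rep("A", 15), rep("B", 6))
treatment <- c(rep(1:5, times = 3), rep(c(2, 5), times = 3))

# Number of samples (donors) in each Process/Treatment cell.
counts <- table(Process = process, Treatment = treatment)
counts
#        Treatment
# Process 1 2 3 4 5
#       A 3 3 3 3 3
#       B 0 3 0 0 3

# Treatments that were never run under Process B.
missing_in_B <- colnames(counts)[counts["B", ] == 0]
missing_in_B
# "1" "3" "4"
```

The empty cells for Process B show why an overall A versus B comparison mixes process differences with treatment differences.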