Question

Agilent two-color microarray analysis with limma: 2x3 factorial design

0

Entering edit mode

Natalia • 0

@natalia-23125

Last seen 4.6 years ago

Germany

Hi, I'm currently using limma to analyze an Agilent 2-color microarray data set.

A short summary of the experiment: lung tissues from 8 mice were extracted and split in 6 pieces. Each piece was transferred to one of two different cell culture media (PBS vs. DMEM) and treated with two different concentrations of a drug (100uM, 500uM) or without drug (0uM=negative control). RNA was measured after 6, 12, and 18 hours. So there are 8 biological replicates in a 2x3 factorial design at 3 timepoints.

Each array consists of a treatment-control pair (0uM vs. 100uM or 0uM vs. 500uM) labelled with Cy3 and Cy5, respectively, in a permuted design. Here is the targets description for one timepoint:

> targets
       gpr.file subject.id medium Cy3_treatment Cy5_treatment
Agilent_003.gpr     M-1335   DMEM             0           100
Agilent_004.gpr     M-1342   DMEM             0           100
Agilent_008.gpr     M-1362   DMEM             0           100
Agilent_008.gpr     M-1332   DMEM             0           500
Agilent_010.gpr     M-1343   DMEM             0           500
Agilent_006.gpr     M-1362   DMEM             0           500
Agilent_008.gpr     M-1389   DMEM             0           500
Agilent_011.gpr     M-1332   DMEM           100             0
Agilent_008.gpr     M-1343   DMEM           100             0
Agilent_007.gpr     M-1346   DMEM           100             0
Agilent_006.gpr     M-1345   DMEM           100             0
Agilent_001.gpr     M-1389   DMEM           100             0
Agilent_001.gpr     M-1335   DMEM           500             0
Agilent_002.gpr     M-1346   DMEM           500             0
Agilent_004.gpr     M-1342   DMEM           500             0
Agilent_007.gpr     M-1345   DMEM           500             0
Agilent_005.gpr     M-1335    PBS             0           100
Agilent_009.gpr     M-1346    PBS             0           100
Agilent_003.gpr     M-1342    PBS             0           100
Agilent_004.gpr     M-1345    PBS             0           100
Agilent_006.gpr     M-1389    PBS             0           100
Agilent_010.gpr     M-1332    PBS             0           500
Agilent_010.gpr     M-1343    PBS             0           500
Agilent_011.gpr     M-1346    PBS             0           500
Agilent_007.gpr     M-1389    PBS             0           500
Agilent_003.gpr     M-1332    PBS           100             0
Agilent_009.gpr     M-1343    PBS           100             0
Agilent_011.gpr     M-1362    PBS           100             0
Agilent_008.gpr     M-1335    PBS           500             0
Agilent_008.gpr     M-1342    PBS           500             0
Agilent_011.gpr     M-1345    PBS           500             0
Agilent_007.gpr     M-1362    PBS           500             0

The biological questions that could be asked are: Which genes are differentially expressed when comparing drug treated vs. untreated cells, and when comparing the two different cell culture media, and when comparing different timepoints (or any interactions thereof).

Because there are no arrays comparing the two media DMEM and PBS (unconnected design), I followed the 'Separate Channel Analysis of Two-Color Data' (limma userguide p. 59).

Also, I decided to analyze the three timepoints separately in order to reduce the factorial design complexity. So, e.g. for timepoint 18h, I did:

> targets2 <- targetsA2C(targets.h18)
> u <- unique(targets2$Target)
> f <- factor(targets2$Target, levels=u)
> design <- model.matrix(~0+f)
> colnames(design) <- u
> u
[1] "h18_0_PBS"    "h18_500_PBS"  "h18_100_PBS"  "h18_0_DMEM"   "h18_500_DMEM"
[6] "h18_100_DMEM"

> corfit <- intraspotCorrelation(MA, design)
> fit <- lmscFit(MA, design, correlation=corfit$consensus)

I defined the contrasts matrix similar to section 9.5.1 in the limma userguide, which should answer the three questions:

Which genes respond to drug treatment with two different concentrations in PBS culture?
Which genes respond to drug treatment with two different concentrations in DMEM culture?
Which genes respond differently to drug treatment with two different concentrations in PBS culture compared to DMEM culture?

> cont.matrix <- makeContrasts(
                        PBS100.PBS0=h18_100_PBS-h18_0_PBS,
                        PBS500.PBS0=h18_500_PBS-h18_0_PBS,
                        DMEM100.DMEM0=h18_100_DMEM-h18_0_DMEM,
                        DMEM500.DMEM0=h18_500_DMEM-h18_0_DMEM,
                        PBS0.DMEM0=h18_0_PBS-h18_0_DMEM,
                        PBS100.DMEM100=h18_100_PBS-h18_100_DMEM,
                        PBS500.DMEM500=h18_500_PBS-h18_500_DMEM,
                        diff100=(h18_100_PBS-h18_100_DMEM)-(h18_0_PBS-h18_0_DMEM),
                        diff500=(h18_500_PBS-h18_500_DMEM)-(h18_0_PBS-h18_0_DMEM),
                        levels=design)

> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
> result <- decideTests(fit2, method="global", adjust.method="BH", p.value=0.05)

The first two contrasts address question 1; the next two contrasts address question 2; and the two interaction terms diff100 and diff500 address question 3. I assume the remaining three contrasts (PBS0.DMEM0, PBS100.DMEM100, PBS500.DMEM500) address a possible additional question: Which genes differ beween PBS culture and DMEM culture with no drug treatment or identical drug treatment?

Now finally to my actual questions:

Do you think the design and contrasts matrix are ok for this experiment?
I'm not sure if biological replicates are taken into account the right way - I wasn't able to find a way to include blocking per mouse.

I would be very grateful for any comment or suggestion.

Natalia

> sessionInfo()

R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/libf77blas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] AnnotationHub_2.16.1   BiocFileCache_1.8.0    dbplyr_1.4.2          
 [4] BiocGenerics_0.30.0    clusterProfiler_3.12.0 stringr_1.4.0         
 [7] optparse_1.6.4         viridis_0.5.1          viridisLite_0.3.0     
[10] limma_3.40.6

limma microarray contrast matrix 2x3 factorial design differential expression • 1.6k views

ADD COMMENT • link updated 4.7 years ago by Gordon Smyth 52k • written 4.7 years ago by Natalia • 0

0

Entering edit mode

Does anyone have any ideas?

ADD REPLY • link 4.7 years ago Natalia • 0

Gordon Smyth · Answer 1 · 2020-04-22

0

Entering edit mode

Gordon Smyth 52k

@gordon-smyth

Last seen 10 hours ago

WEHI, Melbourne, Australia

Because there are no arrays comparing the two media DMEM and PBS (unconnected design), I followed the 'Separate Channel Analysis of Two-Color Data'

That's ok, but you didn't need to do so. You are not making any direct comparisons between DMEM and PBS, so you could have performed a standard two-color analysis.

Do you think the design and contrasts matrix are ok for this experiment?

Looks ok. You could consider adding a probe-specific dye effect:

Dye <- factor(targets$channel.col)
design <- model.matrix(~0+f+Dye)

Which contrasts you choose to make is up to you.

I'm not sure if biological replicates are taken into account the right way - I wasn't able to find a way to include blocking per mouse.

You could add subject.id to the model: model.matrix(~0+f+subject.id), but that would result in some loss of information because each mouse contributes only 4 samples, not 6. I would only do so if there are really noticeable baseline differences between the mice as shown in an MDS plot.

ADD COMMENT • link 4.7 years ago Gordon Smyth 52k

0

Entering edit mode

Dear Gordon, thank you very much for your suggestions, that's really helpful. I have two follow-up questions:

That's ok, but you didn't need to do so. You are not making any direct comparisons between DMEM and PBS, so you could have performed a standard two-color analysis.

My contrast.matrix contains some PBS-vs-DMEM comparisons:

PBS0.DMEM0=h18_0_PBS-h18_0_DMEM
PBS100.DMEM100=h18_100_PBS-h18_100_DMEM
PBS500.DMEM500=h18_500_PBS-h18_500_DMEM

So I thought that I acutally do direct comparisons between DMEM and PBS, or am I missing something?

You could add subject.id to the model: model.matrix(~0+f+subject.id), but that would result in some loss of information because each mouse contributes only 4 samples, not 6.

Could you please explain why there would be no information loss when each mouse would contribute 6 samples instead of 4?

Thanks again, Natalia

ADD REPLY • link updated 4.7 years ago by Gordon Smyth 52k • written 4.7 years ago by Natalia • 0

0

Entering edit mode

You're right, I missed the three DMEM vs PBS comparisons.

My comment about loss of information was also incorrect. I had assumed that there were six treatment groups (not counting the controls) but I see now that there are only 4. So the design is essentially balanced and adding subject.id as a blocking variable would make good sense.

ADD REPLY • link 4.7 years ago Gordon Smyth 52k