Question

limma & design matrix: for 2-color dyes and 4-comparisons?

0

Entering edit mode

Mehmet Ilyas Cosacak • 0

@mehmet-ilyas-cosacak-9020

Last seen 6.2 years ago

Germany/Dresden/ CRTD - DZNE

Hi Everybody,

I have a question regarding limma design matrix. I searched for "limma and design" but could not find an answer for my question. I will be happy if you could send me the link if a similar question has been answered previously!

I am re-analyzing a data set from GEO (GSE74615) to compare with some published microarray data and an RNA-Seq that we have.

I generated the target file as bellow:

FileName   Cy3   Cy5
US12302316_252665511011_S01_GE2-v5_95_Feb07_1_1.txt   microglia_WT   microglia_AD
US12302316_252665511011_S01_GE2-v5_95_Feb07_1_2.txt   microglia_AD   microglia_WT
US12302316_252665511011_S01_GE2-v5_95_Feb07_1_3.txt   microglia_WT   microglia_AD
US12302316_252665511011_S01_GE2-v5_95_Feb07_1_4.txt   microglia_AD   microglia_WT
US12302316_252665511015_S01_GE2-v5_95_Feb07_1_1.txt   microglia_WT   microglia_AD
US12302316_252665511015_S01_GE2-v5_95_Feb07_1_2.txt   microglia_AD   microglia_WT
US12302316_252665511015_S01_GE2-v5_95_Feb07_1_3.txt   microglia_AD   microglia_WT
US12302316_252665511016_S02_GE2-v5_95_Feb07_1_1.txt   astrocytes_AD   astrocytes_WT
US12302316_252665511016_S02_GE2-v5_95_Feb07_1_2.txt   astrocytes_WT   astrocytes_AD
US12302316_252665511016_S02_GE2-v5_95_Feb07_1_3.txt   astrocytes_WT   astrocytes_AD
US12302316_252665511016_S02_GE2-v5_95_Feb07_1_4.txt   astrocytes_AD   astrocytes_WT

What I want to do for DE gene analysis is to compare the followings:

1) microglia_AD vs microglia_WT

2) astrocytes_AD vs astrocytes_WT

3) astrocytes_AD vs microglia_AD

4) astrocytes_WT vs microglia_WT

I did the first 2 comparisons with the script below

library("limma") wtFun <- function(x) as.numeric(x$ControlType == 0) targets <- readTargets() microglia <- targets[!(targets$Cy3 %in% "astrocytes_AD") & !(targets$Cy5 %in% "astrocytes_AD"),] astrocytes <- targets[!(targets$Cy3 %in% "microglia_AD") & !(targets$Cy5 %in% "microglia_AD"),] #1) microglia_AD vs microglia_WT RG <- read.maimages(microglia, source = "agilent", annotation = "ProbeName", wt.fun = wtFun) RG <- backgroundCorrect(RG, method = "normexp", offset = 50) MA <- normalizeBetweenArrays(RG, method = "quantile") MA <- avereps(MA, ID = MA$genes$ProbeName) design <- modelMatrix(microglia, ref = "microglia_WT") fit <- lmFit(MA, design) fit <- eBayes(fit, trend = TRUE) allResults <- topTable(fit, n = Inf)

#2) astrocytes_AD vs astrocytes_WT RG <- read.maimages(astrocytes, source = "agilent", annotation = "ProbeName", wt.fun = wtFun) RG <- backgroundCorrect(RG, method = "normexp", offset = 50) MA <- normalizeBetweenArrays(RG, method = "quantile") MA <- avereps(MA, ID = MA$genes$ProbeName) design <- modelMatrix(astrocytes, ref = "astrocytes_WT") fit <- lmFit(MA, design) fit <- eBayes(fit, trend = TRUE) allResults <- topTable(fit, n = Inf)

is there an easier way to design a design matrix that can do all the comparisons?

thank you very much in advance,

ilyas.

limma microarray limma design matrix • 1.3k views

ADD COMMENT • link updated 7.8 years ago by Gordon Smyth 50k • written 7.8 years ago by Mehmet Ilyas Cosacak • 0

1

Entering edit mode

Note first that there was a typo on the 10th line of your targets file, which had "astroyctes" instead of "astrocytes". I have edited your post to fix that, and you should obviously make sure this is correct in your analysis as well.

ADD REPLY • link 7.8 years ago Gordon Smyth 50k

score 4 · Accepted Answer · 2016-06-29

4

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

The experimental design does not include any direct comparisons between astrocytes and microglia, so you cannot make comparisons 3 and 4 in a conventional two-color microarray analysis. To make the first two comparisons, I would do it like this:

library("limma")
targets <- readTargets("targets.txt")
RG <- read.maimages(targets, source = "agilent")
keep <- RG$genes$ControlType==0
RG2 <- RG[keep,]
RG2 <- backgroundCorrect(RG2, method = "normexp", offset = 16)
MA <- normalizeWithinArrays(RG2, method = "loess")
dm <- modelMatrix(targets[1:7,], ref="microglia_WT")
da <- modelMatrix(targets[8:11,], ref="astrocytes_WT")
design <- cbind(Dye=1, blockDiag(dm,da))
keep <- rowMeans(MA$A) > 6
fit <- lmFit(MA[keep,], design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
topTable(fit, coef="microglia_AD")
topTable(fit, coef="astrocytes_AD")

Note: I made a correction to the design matrix in the above code 14 hours after my original post.

To make all four comparisons that you want to do, it is necessary to undertake a separate channel analysis instead:

targetsC <- targetsA2C(targets)
group <- factor(targetsC$Target)
group <- relevel(group,ref="microglia_WT")
designsc <- model.matrix(~0+group)
colnames(designsc) <- levels(group)
MA <- normalizeWithinArrays(RG2, method = "loess")
MA <- normalizeBetweenArrays(MA, method="Aquantile")
keep <- rowMeans(MA$A) > 6
isc <- intraspotCorrelation(MA[keep,], designsc)
fitsc <- lmscFit(MA[keep,], designsc, isc$consensus)
cont.matrix <- makeContrasts(
   mAD.mWT = microglia_AD - microglia_WT,
   aAD.aWT = astrocytes_AD - astrocytes_WT,
   aAD.mAD = astrocytes_AD - microglia_AD,
   aWT.mWT = astrocytes_WT - microglia_WT,
   levels = designsc
)
fit2 <- contrasts.fit(fitsc, cont.matrix)
fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
topTable(fit2, coef="mAD.mWT")
topTable(fit2, coef="aAD.aWT")
topTable(fit2, coef="aAD.mAD")
topTable(fit2, coef="aWT.mWT")

Note that the separate channel analysis is more powerful and will generally give smaller p-values than the conventional analysis for the same comparisons.

ADD COMMENT • link 7.8 years ago Gordon Smyth 50k

0

Entering edit mode

Hi Gordon,

Thank you very much again and again for your help and answers!

It looks a bit complicated how to generate the design matrix. I will try to understand the logic behind.

However, after running the script the topTable(fit2, coef="mAD.mWT") and topTable(fit, coef="mAD") give different results! At least the top genes differ from each other! Do you think there is mistake here?

best,

ilyas.

topTable(fit2, coef="mAD.mWT")
      Row Col Start                                                     Sequence ProbeUID ControlType     ProbeName    GeneName SystematicName
5127 114 14     0 TGATATTGATGGAGTCTTACCTCTCTGAAACCTTAAGCCCAAATAAATCCTTCCTTCTAT     8485           0 A_55_P2005320      Tm7sf4      NM_029422
>
> topTable(fit, coef = "mAD")
      Row Col Start                                                     Sequence ProbeUID ControlType     ProbeName GeneName SystematicName
21411 257 69     0 ATGGATTCTACTCTTCCCTCTTTGCCTTGGTGAGGCAGCCCAGGTCGCTGCAGCATCTCT    19358           0 A_55_P2044710    Asb10      NM_080444

sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8.1 x64 (build 9600)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] limma_3.26.9

loaded via a namespace (and not attached):
[1] splines_3.2.5 statmod_1.4.24

ADD REPLY • link 7.8 years ago Mehmet Ilyas Cosacak • 0

1

Entering edit mode

Yes, you're right. The separate channel analysis was correct but I created the design matrix incorrectly for the conventional analysis. I have edited the code so that it now correct.

ADD REPLY • link 7.8 years ago Gordon Smyth 50k

0

Entering edit mode

Thanks a lot Gordon! I really appreciate your help!

best, ilyas.

ADD REPLY • link 7.8 years ago Mehmet Ilyas Cosacak • 0