Question: limma design matrix question
0
3.8 years ago by
United States
akridgerunner20 wrote:

Hello,

We're analyzing a two color microarray experiment with universal human reference RNA in the Cy3 green channel. Cy5 red is hybridized to either a monozygotic diseased twin (SLE_Affected), healthy twin (SLE_Healthy), or a healthy unrelated person (Control). Note I first created an MA list called rdr.ma.between.noctl using both within-array loess normalization and GQuantile bewteen-array normalization.

My targets frame is:

 rowName Cy3 Cy5 GSM591943 Universal SLE_Affected GSM591944 Universal SLE_Healthy GSM591951 Universal SLE_Affected GSM591952 Universal SLE_Healthy GSM591955 Universal SLE_Affected GSM591956 Universal SLE_Healthy GSM591967 Universal SLE_Affected GSM591968 Universal SLE_Healthy GSM591917 Universal SLE_Affected GSM591921 Universal SLE_Healthy GSM591922 Universal SLE_Affected GSM591923 Universal SLE_Healthy GSM591906 Universal Control GSM591933 Universal Control GSM591890 Universal Control

I created the design frame using:

design <- modelMatrix(targets, ref = "Universal")

which outputs:

 rowName CTL SLE_Affected SLE_Healthy GSM591943 0 1 0 GSM591944 0 0 1 GSM591951 0 1 0 GSM591952 0 0 1 GSM591955 0 1 0 GSM591956 0 0 1 GSM591967 0 1 0 GSM591968 0 0 1 GSM591917 0 1 0 GSM591921 0 0 1 GSM591922 0 1 0 GSM591923 0 0 1 GSM591906 1 0 0 GSM591933 1 0 0 GSM591890 1 0 0

I proceed with DE analysis using:

fit <- lmFit(rdr.ma.between.noctl, design)

contrast.matrix <- makeContrasts( SLE_Affected, SLE_Healthy, SLE_Affected-SLE_Healthy,
SLE_Affected-CTL, CTL, levels = design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
lupus.vs.ref <- topTable(fit2, coef=1, adjust="BH", n=nrow(rdr.ma.between.noctl))
healthy.vs.ref <- topTable(fit2, coef=2, adjust="BH", n=nrow(rdr.ma.between.noctl))
lupus.vs.healthy <- topTable(fit2, coef=3, adjust="BH", n=nrow(rdr.ma.between.noctl))
lupus.vs.control <- topTable(fit2, coef=4, adjust="BH", n=nrow(rdr.ma.between.noctl))
control.vs.ref <- topTable(fit2, coef=5, adjust="BH", n=nrow(rdr.ma.between.noctl))

Does this all look correct? Exactly how does limma account for the reference channel? When I look at the contrast for "SLE_Affected-SLE_Healthy" amI really looking at (SLE_Affected-universal) - (SLE_Healthy-universal)?

Also, I tried to create a treatment-contrast parameterization approach as a demonstration, but got completely different results! The targets frame remains the same and the design matrix is:

 arrayID HealthyvsREF SLEvsHealthy GSM591943 1 1 GSM591944 1 0 GSM591951 1 1 GSM591952 1 0 GSM591955 1 1 GSM591956 1 0 GSM591967 1 1 GSM591968 1 0 GSM591917 1 1 GSM591921 1 0 GSM591922 1 1 GSM591923 1 0 GSM591906 0 0 GSM591933 0 0 GSM591890 0 0

Followed by:

design.twins <- targets
design.twins <- design.twins[,-1]
design.twins[,1] <- 1
design.twins[,2] <- gsub("SLE_Affected",1,design.twins[,2])
design.twins[,2] <- gsub("SLE_Healthy",0,design.twins[,2])
design.twins[c(13:51),c(1:2)] <- 0 # Zero out all the controls
colnames(design.twins) <- c("HealthyvsREF","SLEvsHealthy")
design.twins[,1] <- as.numeric(design.twins[,1])
design.twins[,2] <- as.numeric(design.twins[,2])

fit.twins <- lmFit(rdr.ma.between.noctl, design.twins)
fit.twins <- eBayes(fit.twins)
top.treatment <- topTable(fit.twins, coef="SLEvsHealthy", adjust="BH", n=nrow(rdr.ma.between.noctl))

What's wrong here?

Best wishes,

Robert Robl

modified 3.7 years ago by Aaron Lun24k • written 3.8 years ago by akridgerunner20

Show the command you used to create the second design matrix. Also, take advantage of the text formatting options to make a clearer distinction between text and code, otherwise it's really hard to read.

Thanks Aaron, done.

1

On an aesthetic note: if you've got big chunks of code, you might consider the "code" style rather than the "inline code" style. The latter, as its name suggests, is intended for distinguishing code that's inline with text.

2
3.7 years ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

I'm not a two-color expert, but the output of modelMatrix looks right to me. The M-values that you're getting from each array represent the log-fold change of your Cy5 sample over the universal reference. As this is true for every array, the reference will cancel out in the final model. In any case, these contrasts:

SLE_Affected-SLE_Healthy 
(SLE_Affected-universal) - (SLE_Healthy-universal)

... are arithmetically equivalent, regardless of the nature of your design matrix. You can see this to be the case; the universal term will just cancel out if you expand past the brackets.

As for your second question, you're missing an intercept term for your control expression:

design.twins <- cbind(1, design.twins) # Do this after design.twins is constructed.

That should give you a design matrix that is effectively equivalent to the one from modelMatrix. Of course, the interpretation of the coefficients will be different, but you should know this already if you're trying to reparameterize the design.

Edit: You may want to consider blocking on the twin pairings, either explicitly in the design matrix or via duplicateCorrelation.