Question: Correct setup of a design matrix for a two-color, interwoven microarray loop design with dye swaps
mwesthues (Germany) wrote 3.8 years ago:

This question has migrated from https://www.biostars.org/p/153362/ to this website.

I am looking for the correct way to set up a design matrix for a two-color, interwoven microarray loop design with dye swaps. So far, the only thoroughly described and discussed examples I have found use a common reference sample in the call to the "modelMatrix()" function in limma.

 

modified 3.8 years ago by James W. MacDonald • written 3.8 years ago by mwesthues

I applaud your attempt at comprehensiveness in posing a question, but in trying to be comprehensive, you have failed to give us much useful information!

  1. Do you have biological replicates? If so, which samples are replicates? Labels such as F001 are needlessly obscure.
  2. What are your goals? If you are using limma, it's already obvious that you are fitting a linear model, so you don't need to tell anybody that. Your goals aren't to fit a linear model, but instead are to determine differences between one or more pairs of sample types (or maybe combinations thereof). What comparisons are you interested in making?
written 3.8 years ago by James W. MacDonald

Thanks for your helpful remarks! I have added information on the experimental design and the goal of our analysis.

1. We do not have any biological replicates, just technical replicates of RNA sources.

2. Our goal is to compute "adjusted" gene expression values for each RNA source.

 

written 3.8 years ago by mwesthues
Answer: Correct setup of a design matrix for a two-color, interwoven microarray loop design with dye swaps
James W. MacDonald (United States) wrote 3.8 years ago:

Your goal is an unconventional one in the microarray context, especially for a two-color microarray. An expression value does not really mean anything by itself; it is only useful when compared to the expression of the same gene in a different sample. In addition, standard errors of technical replicates only tell you how reproducible the technology is, not how variable the gene expression is across your samples (and presumably across the population of interest).

Ideally you would analyze this as a two-color array, in which case you would compute (and use) the log ratios that are inherent to this sort of design. Note that starting on p. 38 (and again on p. 54) of the limma User's Guide, there are examples that directly relate to what you are doing. However, this will not give you individual expression values for each gene; it will give you log ratios.

If you are truly interested in expression estimates for each gene, then you likely have to do a separate channel analysis of the two-color data, which is covered in the limma User's Guide starting on p. 58.
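As a rough illustration of the first step of a separate channel analysis, here is a minimal base-R sketch that expands a targets table to one row per channel and builds a design matrix with one coefficient per RNA source. The targets table, array names and source labels are made up for illustration. In limma, targetsA2C() is meant for the expansion step, and the fitting calls are shown only as comments because they need a real, normalized MAList.

```r
# Hypothetical targets table: one row per array (three arrays of a loop).
targets <- data.frame(
  Array = paste0("array", 1:3),
  Cy3   = c("F001", "F002", "F003"),
  Cy5   = c("F002", "F003", "F001"),
  stringsAsFactors = FALSE
)

# Expand to one row per channel, keeping the two channels of an array adjacent.
# limma's targetsA2C() is intended for this step; it is done by hand here.
channels <- data.frame(
  Array   = rep(targets$Array, each = 2),
  Channel = rep(c("Cy3", "Cy5"), times = nrow(targets)),
  Target  = as.vector(rbind(targets$Cy3, targets$Cy5)),
  stringsAsFactors = FALSE
)

# One coefficient per RNA source, no intercept.
f <- factor(channels$Target, levels = unique(channels$Target))
design <- model.matrix(~ 0 + f)
colnames(design) <- levels(f)
print(design)

# With a normalized MAList `MA` (not constructed here), the analysis would
# continue roughly along these lines with limma:
#   corfit <- intraspotCorrelation(MA, design)
#   fit    <- lmscFit(MA, design, correlation = corfit$consensus)
```

Each row of this matrix is a single channel of a single array, so the fitted coefficients are per-source log-expression estimates rather than log ratios.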

written 3.8 years ago by James W. MacDonald
  1. I will try a separate channel analysis of the two-color data and check whether that works for me.
  2. Regarding the direct two-color design, I still do not understand how to set up the design matrix. In the limma User's Guide (p. 38 and p. 54) a common reference is used, which I do not have; we used an interwoven loop design instead. Hence my original question on the design matrix for such a direct two-color design with dye swaps. Do you have any suggestions?
written 3.8 years ago by mwesthues

On page 38, the design is NOT a common reference design. Did you read that page? I am not sure why you think it is a common reference design. Note that there is a difference between using a common reference on your arrays and specifying a sample type to use as a reference when you are fitting a model.

The main idea behind using a two-color array is that a split-plot design is inherently more powerful because you can more accurately account for technical variability. And in a microarray context, where the goal is usually to compare gene expression between samples (rather than trying to estimate gene expression), the fact that the data are all log ratios is not a hardship.

Even with an interwoven design you can simply declare one sample to be the reference, so all your coefficients then estimate the log ratio between a given sample and the reference. Making any other comparison is then as simple as taking the difference between two coefficients, which is the log ratio between the corresponding samples. In other words:

Coef1 = log(F002/F001)

Coef2 = log(F003/F001)

log(F003/F002) = Coef2 - Coef1

So specifying a reference in this context has nothing to do with how the arrays were hybridized, and everything to do with how you plan to make comparisons, and which comparisons are more convenient.
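To make this concrete, here is a minimal base-R sketch for a hypothetical loop of three RNA sources in which every pairing is also hybridized as a dye swap. The targets table, the "true" expression values and the 0.3 dye bias are invented for illustration. The matrix is built by hand so the coding is visible; limma's modelMatrix(targets, ref = "F001") is intended to build this kind of matrix from the Cy3 and Cy5 columns. The extra Dye column at the end is an assumption about how one might absorb a constant dye bias, not something stated in this thread.

```r
# Hypothetical loop of three RNA sources; arrays 4-6 are dye swaps of 1-3.
targets <- data.frame(
  Cy3 = c("F001", "F002", "F003", "F002", "F003", "F001"),
  Cy5 = c("F002", "F003", "F001", "F001", "F002", "F003"),
  stringsAsFactors = FALSE
)

# Reference for the model only; F001 is not hybridized on every array.
ref <- "F001"
others <- setdiff(sort(unique(c(targets$Cy3, targets$Cy5))), ref)

# One row per array, one column per non-reference source:
# +1 if the source is in Cy5, -1 if it is in Cy3, 0 if it is not on the array.
design <- sapply(others, function(s) (targets$Cy5 == s) - (targets$Cy3 == s))
storage.mode(design) <- "double"

# Made-up log2 expression of one gene, and the noise-free log ratios
# M = log2(Cy5 / Cy3) that each array would then show.
truth <- c(F001 = 8.0, F002 = 9.5, F003 = 7.0)
M <- unname(truth[targets$Cy5] - truth[targets$Cy3])

# Least-squares fit: each coefficient is a log ratio against F001.
coefs <- setNames(as.vector(qr.solve(design, M)), colnames(design))

# Any other comparison is a difference of coefficients: log(F003/F002).
contrast_F003_vs_F002 <- coefs[["F003"]] - coefs[["F002"]]

# Assumed extension: a constant column to absorb a dye bias (here 0.3),
# which the dye swaps make estimable.
design_dye <- cbind(Dye = 1, design)
coefs_dye <- setNames(as.vector(qr.solve(design_dye, M + 0.3)),
                      colnames(design_dye))

print(design)
print(coefs)
print(contrast_F003_vs_F002)
print(coefs_dye)
```

Choosing a different value for `ref` changes the columns and the meaning of each coefficient, but not the fitted log ratio for any given pair of sources.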

written 3.8 years ago by James W. MacDonald

Many thanks for the clarification! I will try as you have suggested.

 


 

written 3.8 years ago by mwesthues