Guest User
@guest-user-4897
Hi all,
Could somebody explain how the design matrix for two-channel
microarray experiments is constructed in Limma, in particular the
matrices given for each experiment in Figure 1 of the empirical Bayes
paper (http://www.statsci.org/smyth/pubs/ebayes.pdf)?
For single channel arrays, the design matrix seems to assume the form
of standard linear model design matrices; that is, 1 where an array
treatment is present and 0 otherwise. From here, the resulting model
parameters can be tested with the implementation of an appropriate
contrast matrix (where, typically, each contrast effect sums to zero).
This does not appear to be the case for two-channel experiments.
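To make the single-channel case concrete, here is a minimal sketch in
plain Python (not limma itself). The treatment labels T1 and T2 and
the intensity values are made up purely for illustration.

```python
# Hypothetical example: four single-channel arrays, two per treatment.
treatments = ["T1", "T1", "T2", "T2"]   # treatment hybridised to each array
levels = ["T1", "T2"]                   # one design column per treatment

# Indicator coding: 1 where the array received the treatment, 0 otherwise.
design = [[1 if t == level else 0 for level in levels] for t in treatments]

# Made-up log2 intensities for one gene (illustration only).
y = [8.0, 8.4, 10.1, 9.9]

# With this coding the least-squares coefficients are the group means.
coef = []
for j in range(len(levels)):
    values = [y[i] for i in range(len(y)) if design[i][j] == 1]
    coef.append(sum(values) / len(values))

# Contrast T2 - T1; its weights sum to zero.
contrast = [-1, 1]
estimate = sum(c * b for c, b in zip(contrast, coef))

print(design)
print(coef, estimate)
```

The contrast is applied to the fitted coefficients, not to the raw
data, which is why the design matrix itself only needs 0/1 entries.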
In the above paper, the aforementioned experiments are given in Kerr
and Churchill arrow notation (where the arrow head points toward the
RNA sample labelled with red dye and the sample at the arrow base is
labelled green).
The experiments can be summarised as follows:
(a)
Red Green
RNA1 RNA2
For this experiment, it seems to me that the only parameter of
interest (let's call it mu1) is the response value (or the mean of the
response values if we have more than one identical replicate). Because
the response is estimated by the (mean of the) log2 fold change
between the red and green channels, the design "matrix" in this
instance is simply (1); this becomes a column of 1 values if there is
more than one identical replicate.
(b)
Red Green
RNA1 RNA2
RNA2 RNA1
In this experiment there are two arrays but, as in experiment (a),
only one comparison of interest (namely, the difference between RNA1
and RNA2). Because the dyes in the second array are swapped relative
to the first array, the ratio is inverted too. Inverting the term
inside the logarithm negates the response (i.e. log2(RNA2/RNA1) =
-log2(RNA1/RNA2)), so the second array yields the negative of the
response from the first. For consistency, we must multiply the second
response value by -1. As a result, we have the design matrix: (1, -1).
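The reasoning for (a) and (b) can be checked with a small sketch in
plain Python. The helper name fit_one_coef and the log-ratio values
are hypothetical; the helper is just ordinary least squares for a
single coefficient, not a limma function.

```python
def fit_one_coef(design, m):
    """Least-squares estimate for a one-column design: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(design, m)) / sum(x * x for x in design)

# Experiment (a) with three identical replicates: a column of 1s.
design_a = [1, 1, 1]
m_a = [1.0, 1.3, 0.7]            # made-up log2(red/green) values
beta_a = fit_one_coef(design_a, m_a)
mean_a = sum(m_a) / len(m_a)     # the estimate is just the mean log-ratio

# Experiment (b): the second array is the dye swap, so its entry is -1.
design_b = [1, -1]
m_b = [1.2, -0.8]                # made-up log2(red/green) values
beta_b = fit_one_coef(design_b, m_b)

# Equivalent to flipping the sign of the dye-swapped array and averaging.
flipped_mean = (m_b[0] + (-m_b[1])) / 2

print(beta_a, mean_a)
print(beta_b, flipped_mean)
```

Putting -1 in the design matrix has the same effect as negating the
second log-ratio by hand, so both arrays estimate log2(RNA1/RNA2).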
I'm confused about how the design matrices are formed for experiments
(c) and (d).
In (c), RNA1 and RNA2 are compared through a common reference.
(c)
Red Green
Ref RNA1
RNA1 Ref
RNA2 Ref
The design matrix is given by (-1 0; 1 0; 1 1), where ";" denotes
the end of a matrix row; the first coefficient estimates the
difference between RNA1 and the reference sample, whilst the
second coefficient estimates the difference between RNA1 and RNA2.
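One way to see where these rows come from is to write each sample in
terms of the coefficients and take red minus green for every array.
The sketch below is plain Python; the helper design_rows and the
coordinate table are hypothetical, and it assumes the coefficients are
b1 = RNA1 - Ref and b2 = RNA2 - RNA1 (the sign convention that matches
the quoted matrix).

```python
def design_rows(arrays, coords):
    """One design row per array: coords[red] - coords[green]."""
    return [
        [r - g for r, g in zip(coords[red], coords[green])]
        for red, green in arrays
    ]

# Arrays of experiment (c) as (red, green) pairs, in the order listed.
arrays_c = [("Ref", "RNA1"), ("RNA1", "Ref"), ("RNA2", "Ref")]

# Each sample expressed relative to Ref in terms of (b1, b2):
#   Ref  = 0
#   RNA1 = Ref + b1        (b1 = RNA1 - Ref, assumed)
#   RNA2 = RNA1 + b2       (b2 = RNA2 - RNA1, assumed)
coords_c = {"Ref": (0, 0), "RNA1": (1, 0), "RNA2": (1, 1)}

design_c = design_rows(arrays_c, coords_c)
print(design_c)  # [[-1, 0], [1, 0], [1, 1]]
```

The third row is (1 1) because the log-ratio RNA2 - Ref on that array
is the sum (RNA1 - Ref) + (RNA2 - RNA1), i.e. b1 + b2.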
Experiment (d) is a saturated direct design comparing three samples.
(d)
Red Green
B A
A C
C B
The design matrix is given by (1 0; 0 1; -1 -1), where the first
coefficient estimates the difference B - A and the second
coefficient estimates the difference C - B.
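The same red-minus-green bookkeeping can be used to check which
coefficient definitions reproduce a quoted matrix for the arrays as
listed. This is a sketch in plain Python; design_rows and both
coordinate tables are hypothetical, and the two parameterisations
tried are assumptions rather than anything taken from the paper.

```python
def design_rows(arrays, coords):
    """One design row per array: coords[red] - coords[green]."""
    return [
        [r - g for r, g in zip(coords[red], coords[green])]
        for red, green in arrays
    ]

# Arrays of experiment (d) as (red, green) pairs, in the order listed.
arrays_d = [("B", "A"), ("A", "C"), ("C", "B")]
quoted = [[1, 0], [0, 1], [-1, -1]]

# Assumption 1: b1 = B - A and b2 = C - B, so B = A + b1, C = B + b2.
coords_cb = {"A": (0, 0), "B": (1, 0), "C": (1, 1)}
rows_cb = design_rows(arrays_d, coords_cb)

# Assumption 2: b1 = B - A and b2 = A - C, so B = A + b1, C = A - b2.
coords_ac = {"A": (0, 0), "B": (1, 0), "C": (0, -1)}
rows_ac = design_rows(arrays_d, coords_ac)

print(rows_cb)  # [[1, 0], [-1, -1], [0, 1]]
print(rows_ac)  # [[1, 0], [0, 1], [-1, -1]]
print(rows_ac == quoted, sorted(rows_cb) == sorted(quoted))
```

With the arrays in the order listed, taking the second coefficient as
A - C gives the quoted matrix row for row, whereas taking it as C - B
gives the same three rows with the second and third arrays swapped. So
the pairing of rows to arrays depends on the order and orientation in
which the arrays are written down.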
Also, on page 39 of the Limma user guide (http://www.bioconductor.org/
packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf), you
can find a design and contrast matrix for a direct two-colour design.
The experiment compares CD4, CD8 and DN. I'm not really sure how this
design/contrast works.
An explanation of the above structures would be greatly appreciated.
Joseph
Sent via the guest posting facility at bioconductor.org.