Question

Design matrix for a two-color experimental design testing for time effect on 3 treated cell-lines, each with its own time=0 control

0

Entering edit mode

rubi ▴ 110

@rubi-6462

Last seen 5.7 years ago

Hi,

I did microarray gene-expression profiling on the following experimental design:

I took three cell lines and plated each one of them in four plates, hence a total of 12 plates.

I purified RNA from a single plate of each of the three cell lines at time=0, then X-ray irradiated the remaining plates, and then for each cell-line purified RNA at three different time points (24h, 48h, and 72h), i.e. a plate per time-point.

For each cell-line, the reference channel in the microarray is its corresponding non-irradiated sample.

The cell-lines in this context are treated as biological replicates (perhaps not ideal but this is merely a pilot experiment).

I'd like to estimate the effect of time on irradiated-cell gene expression where gene expression from each cell-line would be the irradiated sample at each time point relative to its corresponding the non-irradiated time=0 sample. I guess that if my response was log fold-change (log.fc) of a certain time point relative to time 0, the model would be: log.fc ~ time and if I wanted to adjust for cell-line it would be: log.fc ~ time + cell.line

Here's my targets data.frame (Cy3 lists the irradiated samples and Cy5 the non-irradiated samples):

targets <- data.frame(SlideNumber = rep(1:3,3),
                      FileName=c("/data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_1.txt",
                                 "/data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_2.txt",
                                 "/data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_3.txt",
                                 "/data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_1.txt",
                                 "/data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_2.txt",
                                 "/data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_3.txt",
                                 "/data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_1.txt",
                                 "/data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_2.txt",
                                 "/data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_1.txt"),
                      Cy3 = c("xray.24_1","xray.48_1","xray.72_1",
                              "xray.24_2","xray.48_2","xray.72_2",
                              "xray.24_3","xray.48_3","xray.72_3"),
                      Cy5 = c("ctrl.0_1","ctrl.0_1","ctrl.0_1",
                              "ctrl.0_2","ctrl.0_2","ctrl.0_2",
                              "ctrl.0_3","ctrl.0_3","ctrl.0_3"),
                      cell.line = c(rep(1,3),rep(2,3),rep(3,3)),
                      time=rep(c(24,48,72),3),
                      stringsAsFactors = F)

I read the scanned files and did the normalizations using this code:

RG <- limma::read.maimages(targets$FileName, source="agilent.median")
RG.bg.corrected <- limma::backgroundCorrect(RG,method="normexp",offset=16)
RG.bg.within.array.corrected <- limma::normalizeWithinArrays(RG.bg.corrected,method="loess")
RG.bg.between.array.corrected <- limma::normalizeBetweenArrays(RG.bg.within.array.corrected,method="Aquantile")

And for setting up a design matrix for the log.fc ~ time model, I tried:

library(dplyr)
design <- model.matrix(y ~ time, data=targets %>% dplyr::mutate(y=rnorm(9)))

But I'm getting this error when trying to run lmFit:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [1] not equal to array extent

And for the model adjusting for cell-line I set up this design matrix:

targets$cell.line <- factor(targets$cell.line,levels=1:3)
design <- model.matrix(y ~ time+cell.line,contrasts=c(NA,"contr.treatment"),data=targets %>% dplyr::mutate(y=rnorm(9)))

But running lmFit still produces the same error.

Any idea what the problem is or what should the design matrix be?

Here are the first 5 rows of RG.bg.between.array.corrected

> RG.bg.between.array.corrected[1:5,]
An object of class "MAList"
$targets
                                                          FileName
1 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_1.txt
2 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_2.txt
3 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_3.txt
4 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_1.txt
5 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_2.txt
6 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_3.txt
7 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_1.txt
8 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_2.txt
9 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_3.txt

$genes
  Block Column Row            Name                    ID RefNumber ControlType        GeneName TopHit Description
1     1      1   1 GE_BrightCorner       GE_BrightCorner     62976         pos GE_BrightCorner     NA          NA
2     1      2   1 GE_BrightCorner       GE_BrightCorner     62648         pos GE_BrightCorner     NA          NA
3     1      3   1      DarkCorner            DarkCorner     62320         pos      DarkCorner     NA          NA
4     1      4   1      DarkCorner            DarkCorner     61992         pos      DarkCorner     NA          NA
5     1      5   1  XP_004869187.1 CUST_5675_PI439843739     61664       false                     NA          NA

$source
[1] "agilent.median"

$printer
$ngrid.r
[1] 1

$ngrid.c
[1] 1

$nspot.r
[1] 384

$nspot.c
[1] 164

$M

     /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_1 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_2 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_3



[1,]                                                 -0.003804927                                                   0.07581274                                                   0.32692028
[2,]                                                 -0.100439724                                                  -0.12135554                                                   0.10399374
[3,]                                                 -0.100439724                                                  -0.30350304                                                   0.14603983
[4,]                                                 -0.016358922                                                   0.18537314                                                   0.71029227
[5,]                                                  0.000278216                                                  -0.07288553                                                  -0.02972195
     /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_1 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_2 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_3

[1,]                                                   0.09153339                                                  0.191030075                                                   0.30743532
[2,]                                                  -0.11118199                                                  0.008799683                                                  -0.04141555
[3,]                                                   0.03102885                                                 -0.159041326                                                  -0.26419409
[4,]                                                  -0.24002567                                                  0.103334305                                                   0.04191329
[5,]                                                  -0.21634197                                                 -0.344161687                                                  -0.46036156
     /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_1 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_2 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_3

[1,]                                                  0.063933997                                                   0.02578673                                                   0.17873555
[2,]                                                  0.040366462                                                   0.08211603                                                  -0.03525927
[3,]                                                 -0.045199954                                                   0.05364567                                                  -0.18528808
[4,]                                                  0.567896630                                                   0.84323869                                                   1.03922721
[5,]                                                  0.004504603                                                  -0.12057734                                                  -0.08462349

$A

     /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_1 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_2 /data/SG15074465_258552110006_S001_GE2_1200_Jun14_PEG200_1_3

[1,]                                                    11.694533                                                    12.125544                                                    11.851844
[2,]                                                     4.587902                                                     4.517597                                                     4.493085
[3,]                                                     4.587902                                                     4.623010                                                     4.514404
[4,]                                                     7.225362                                                     7.568576                                                     7.128799
[5,]                                                     8.326830                                                     8.414536                                                     8.234831

     /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_1 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_2 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_1_3

[1,]                                                    13.510580                                                    12.466787                                                    11.687914
[2,]                                                     6.000278                                                     4.472188                                                     4.542115
[3,]                                                     4.720447                                                     4.480202                                                     4.499401
[4,]                                                     8.146038                                                     7.809378                                                     7.921780
[5,]                                                     7.795751                                                     8.051800                                                     7.590878

     /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_1 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_2 /data/SG15074465_258552110010_S001_GE2_1200_Jun14_PEG200_2_3

[1,]                                                    11.745827                                                    11.204838                                                    12.069992
[2,]                                                     4.449941                                                     4.588539                                                     4.472029
[3,]                                                     4.493532                                                     4.523331                                                     4.576514
[4,]                                                     7.656538                                                     7.598940                                                     7.566575
[5,]                                                     7.777802                                                     7.843019                                                     7.658971

limma microarray design matrix twochannel • 1.3k views

ADD COMMENT • link updated 5.7 years ago by Gordon Smyth 50k • written 5.7 years ago by rubi ▴ 110

0

Entering edit mode

It is unnecessary to add random numbers to a targets file. To make a design matrix, you can simply use:

design <- model.matrix(~time, data=targets)

ADD REPLY • link 5.7 years ago Gordon Smyth 50k

score 1 · Answer 1 · 2018-08-04

The error arises from annotation columns, not from the design matrix

This error doesn't have anything to do with the design matrix. This particular error arises when one tries to add rownames to a matrix and the vector of rownames doesn't agree with the row dimension of the matrix. If this error has indeed come from lmFit (and that seems strange to me), then the linear models have been fitted correctly but the error has occurred when trying to adding rownames to the fitted model object at the end of the computation.

The output you show doesn't match the code

There are some strange things going on with your code. The MAList object RG.bg.between.array.corrected that you show cannot have been created by the limma code that you give. In particular, the columns called ID, RefNumber and TopHit are all non-standard annotation column names and hence cannot have been read in by the default read.maimages() command that you give.

Another issue is that your annotation columns contain NA values, and I have never seen NA values from an Agilent file -- missing values are instead represented as empty strings. I conclude that you must have added columns to the data object yourself independently of the limma functions.

Another issue is that you haven't actually shown the code that gives the error.

What's going on?

I was guessing that you have edited the data object using code that you don't show and that there are problems with it, for example columns not all of the same length.

Eventually it became apparent that you have set the genes component of the MAList independently of the code shown in your question, and that your genes data.frame has more rows than there are probes on your microarray. In other words, your MAList is invalid because the number of rows of the different components don't agree. This leads to an error when lmFit tries to set the rownames of the fitted model object.

The lesson is that you should read the annotation columns using read.maimages as explained in the limma User's Guide (Section 4.4) and not try to set them yourself. It's quicker and easier all round.

My original answer

(For the record, the rest is my original answer before more information was added to the question.)

Well, there a lot of things going on here and you're still some way from making a start on analysing this dataset. I'll try to point you in the right direction:

What analysis are you trying to do? Normally it wouldn't be enough to regress expression on time ignoring cell line. Are you perhaps trying to regress on time adjusting for cell line? Are you interested in the differences between cell lines? What is your biological question?
Where exactly is your data? Your code doesn't show any object containing any expression data. You claim to have run the lmFit() function, but on what data? How have you read the expression data into R? How have you background corrected and normalized?

The last point is the important. Without knowing what form of expression data you are trying to analyse, it isn't possible to advise you how to form the design matrix.