design and contrast matrix for limma (time course with 2 different genotypes)
ijvechetti ▴ 10
@ijvechetti-20701
Last seen 17 months ago
United States

Hi everyone,

I'm new to bioinformatics and am having trouble designing a matrix to compare my time course (0, 1, 3, 5 and 7 days) in 2 different mouse genotypes (wild type and knockout). I have a replicate in all the time points.

I want to find out which genes (microarray) are DE during the time course in the wild type as well as in the knockout, and which DE genes differ between wild type and knockout. Because of my lack of experience, I cannot figure out how to create the correct design matrix and the correct contrasts.

Any guide would be much appreciated.

Ivan

limma microarray • 1.8k views

Your experiment seems almost exactly the same as covered in Section 9.6.1 (Time Course Experiments) of the limma User's Guide.

Aaron has outlined a nested interaction approach to forming the design matrix (see Section 9.5.3). You could either follow Aaron's code or follow Section 9.6.1 -- both ways will give the same results in the end.
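As a rough illustration of the group-based alternative, here is a minimal sketch that treats every genotype-by-day combination as its own group. It assumes 2 replicates per combination (20 arrays) and a particular column order, neither of which is confirmed at this point in the thread; the names `day`, `lev`, `group` and `con` are made up for the example.

```r
# Assumed layout: 2 genotypes x 5 time points x 2 replicates = 20 samples,
# ordered WT first, then KO, with replicates of the same day adjacent.
geno <- rep(c("WT", "KO"), each = 10)
day  <- rep(rep(c(0, 1, 3, 5, 7), each = 2), times = 2)

# One group per genotype-by-day combination, e.g. "WT.d0", "KO.d7".
lev   <- paste0(rep(c("WT", "KO"), each = 5), ".d", c(0, 1, 3, 5, 7))
group <- factor(paste0(geno, ".d", day), levels = lev)

# Group-means design: each coefficient is the mean log-expression of one group.
design <- model.matrix(~0 + group)
colnames(design) <- lev

# Example contrast written out by hand: does the day-0 to day-1 change
# differ between genotypes?  (KO.d1 - KO.d0) - (WT.d1 - WT.d0)
con <- setNames(numeric(length(lev)), lev)
con[c("KO.d1", "WT.d0")] <- 1
con[c("KO.d0", "WT.d1")] <- -1
```

In limma the same comparison could be written as `makeContrasts((KO.d1 - KO.d0) - (WT.d1 - WT.d0), levels = design)`. Unlike a straight-line fit, this formulation makes no assumption about the shape of the time trend, but it needs replicates within each group.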


Thank you for the input.

Aaron Lun ★ 28k
@alun
Last seen 7 hours ago
The city by the bay

So your experiment looks like this:

times <- rep(c(0, 1, 3, 5, 7), 2)
geno <- rep(c("WT", "KO"), each=5)

There are a few ways you can formulate your design matrix, but I would go with:

design <- model.matrix(~0 + geno + times:geno)

The first two coefficients represent the intercept in each genotype, while the next two represent the slope of change with respect to time (in days) within each genotype. So, you're effectively fitting a line to the (log-)expression across time, and doing that separately for each genotype.

The main assumption is, of course, that the change in log-expression is linear with respect to time. I've also assumed that each time point comes from an independent replicate; if all time points were collected from the same mouse, then you effectively have n=1 and would not be able to make any population inferences.
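To make the meaning of the four coefficients concrete, here is a small sketch that fits this design to a single noiseless toy gene. The expression values in `y` are invented for illustration (WT starts at 8 and rises by 0.5 per day, KO stays flat at 6); they are not from the poster's data.

```r
times <- rep(c(0, 1, 3, 5, 7), 2)
geno  <- rep(c("WT", "KO"), each = 5)

# Same design as above: one intercept and one slope per genotype.
design <- model.matrix(~0 + geno + times:geno)
colnames(design)

# Hypothetical noiseless log-expression for one gene.
y <- ifelse(geno == "WT", 8 + 0.5 * times, 6)

# Ordinary least squares on the design; coefficients come back in
# column order: KO intercept, WT intercept, KO slope, WT slope.
coefs <- lm.fit(design, y)$coefficients
coefs
```

The fitted values recover the two intercepts (6 and 8) and the two per-day slopes (0 and 0.5), which is exactly the "one line per genotype" interpretation described above.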

To test for the effect of time in each genotype, simply drop the relevant coefficient (genoKO:times or genoWT:times) in topTable. If you want to test for differences in the time effect between genotypes:

colnames(design) <- make.names(colnames(design)) # make names syntactically valid
con <- makeContrasts(genoKO.times - genoWT.times, levels=design)

... and then go through contrasts.fit and eBayes.
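For completeness, the remaining steps might look something like the sketch below. It runs on a simulated matrix (`dat`, 100 hypothetical genes by 10 samples) purely so that the code is self-contained; the object names `fit.time`, `top.ko`, `fit2` and `top.diff` are placeholders, and the real expression matrix would take the place of `dat`.

```r
library(limma)

set.seed(1)
times <- rep(c(0, 1, 3, 5, 7), 2)
geno  <- rep(c("WT", "KO"), each = 5)
design <- model.matrix(~0 + geno + times:geno)
colnames(design) <- make.names(colnames(design))

# Simulated log-expression values standing in for real data (assumption).
dat <- matrix(rnorm(100 * 10, mean = 8), nrow = 100,
              dimnames = list(paste0("gene", 1:100), NULL))

fit <- lmFit(dat, design)

# Time effect within one genotype: test that genotype's slope coefficient.
fit.time <- eBayes(fit)
top.ko <- topTable(fit.time, coef = "genoKO.times", number = 5)

# Difference in time effect between genotypes: contrast of the two slopes.
con  <- makeContrasts(genoKO.times - genoWT.times, levels = design)
fit2 <- eBayes(contrasts.fit(fit, con))
top.diff <- topTable(fit2, coef = 1, number = 5)
```

Because the simulated data contain no real signal, the resulting tables are only useful for checking that the code runs and that the contrast picks out the intended coefficients.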

P.S. Don't use the "Tutorial" tag; that is intended for people who are writing and publishing tutorials, rather than those who are asking for help. The latter is just a normal question on this site.


Thank you so much for your help. I will follow your and Gordon's comments and try to do my stats. I appreciate the input.


Hello Aaron, sorry for this type of question. I'm getting an error when I try to fit my data with the design (I'm sure it is something really stupid and easy to fix, but I cannot figure it out). Here is what I'm trying to do:

dat=read.table("Array.txt",header=TRUE,sep="\t", row.names=1)
dat = as.matrix(dat)
time<- rep(c(0, 1, 3, 5, 7),2)
geno<- rep(c("WT", "KO"), each=5)
design<- model.matrix(~0 + geno + time:geno)
colnames(design)<- make.names(colnames(design))
fit <- lmFit(dat, design)
Error in lmFit(dat, design) : 
  row dimension of design doesn't match column dimension of data object.

Is it a problem with how my Excel spreadsheet is organized? How can I fix it?

Thanks once again

Ivan


The error is pretty self-explanatory. The number of columns in dat should be equal to the number of rows in design, and both of them should be equal to the number of samples in your experiment. If this is not the case, lmFit can't be expected to work out which columns are samples and which are not. Look at head(dat) and figure it out yourself.
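A quick way to see the mismatch is to compare the two dimensions directly before calling lmFit. The sketch below uses a hypothetical placeholder matrix with 20 sample columns in place of the real file, so the numbers are assumptions for illustration only.

```r
# Hypothetical stand-in for the matrix read from the file: 50 probes x 20 samples.
dat <- matrix(0, nrow = 50, ncol = 20)

time <- rep(c(0, 1, 3, 5, 7), 2)
geno <- rep(c("WT", "KO"), each = 5)
design <- model.matrix(~0 + geno + time:geno)

dim(dat)     # rows = probes, columns = samples
dim(design)  # rows = samples, columns = coefficients

# lmFit needs one design row per data column.
samples_match <- ncol(dat) == nrow(design)
samples_match
```

If the check is FALSE, either the sample annotation is too short or too long, or the data matrix still contains columns that are not samples (for example, annotation columns that were read in alongside the expression values).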


I saw the problem. In my list, I have 2 replicates per time point (0, 1, 3, 5, 7) and 2 genotypes (KO, WT), so there are 20 samples in total, not 10 as in the previous model.matrix. However, if I do:

time<- rep(c(0, 1, 3, 5, 7),2)
geno<- rep(c("WT", "KO"), each=10)
design<- model.matrix(~0 + geno + time:geno)
Error in model.frame.default(object, data, xlev = xlev) : 
  variable lengths differ (found for 'time')

I'm new to this and cannot figure it out. If I try

time<- rep(c(0, 1, 3, 5, 7),4)
geno<- rep(c("WT", "KO"), each=10)

it runs, but then everything is significant. I'm trying to understand this so that I can do it myself in the future.
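The "variable lengths differ" error arises because the time vector has 10 entries while the genotype vector has 20. One way to keep the annotation honest is to write it as a table with one row per array and tabulate it, as in the sketch below. The column order assumed here (WT before KO, replicates of the same day adjacent) is a guess; the real order in Array.txt is not given in the thread, and the annotation has to follow whatever that order actually is.

```r
time <- rep(c(0, 1, 3, 5, 7), 2)       # 10 values
geno <- rep(c("WT", "KO"), each = 10)  # 20 values
lengths_differ <- length(time) != length(geno)

# One row per array, in the same order as the columns of the data matrix.
# Assumed (hypothetical) order: WT then KO, replicates of a day side by side.
samples <- data.frame(
  geno = rep(c("WT", "KO"), each = 10),
  time = rep(rep(c(0, 1, 3, 5, 7), each = 2), times = 2)
)

# Every genotype-by-day cell should contain the expected number of replicates.
counts <- table(samples$geno, samples$time)
counts
```

Note that `rep(c(0, 1, 3, 5, 7), 4)` also gives 2 replicates per cell, but it describes a different column order (each block of five columns being one complete time series). A balanced table only shows that the annotation is internally consistent; whether it matches the file can only be confirmed by comparing it against the column names of the data.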


"In my list, I have 2 replicates per time point (0, 1, 3, 5, 7) and 2 genotypes (KO, WT), so there are 20 samples in total, not 10 as in the previous model.matrix."

You should consider being more precise in your original post, where you wrote "I have a replicate in all the time points."

"I'm new to this and cannot figure it out. If I try"

Okay. Stop. It sounds like you're just trying bits of code without really understanding what's going on. While I could give you a code snippet that runs, neither you nor I would have any confidence that it is giving you the correct results. I strongly suggest you talk to a local bioinformatician or statistician to get some face-to-face help. This support site is for specific questions about Bioconductor software, not for learning about R/statistics, and the latter seems to be the current challenge here.


I really thought I had given clear information about my design, but I apologize since it is clear that I hadn't. As I said, I've been analyzing microarray data with 2 groups and never had a problem before. My real problem is that, with my experimental conditions, I don't understand how to design a correct matrix to be able to do the contrasts. The first design you sent me helped a lot because I could understand how this is done. I have also been able to get the comparisons I wanted using Section 9.6.1, as Gordon suggested. Next time I will make sure I'm clear about my experimental design. Thanks for your help and time.
