Hi
I am trying to analyse a microarray dataset from GEO.
The data is already normalized and log2 transformed. I have 30 paired patient samples of benign prostate epithelia and stroma tissue. The data is grouped like this:
Patient | Treatment |
1 | B |
1 | sB |
2 | B |
2 | sB |
3 | B |
3 | sB |
.... |
|
|
I have big trouble finding out if how to design my contrast matrix. I want to identify differentially expressed genes between epithelia and stroma. I have read the examples in the Limma User Guide, p 41, but my design matrix is more complicated than the expamles in the User Guide. Also I am confused about the formulation of substrating in the contrast matrix, when you want to find differentially expressed gene, like the example below:
contrast.matrix <- makeContrasts(Benign_epithelvsBenign_stroma = Benign_epithel - Benign_stroma, levels = design)
1) Is this the coding for devision of the two expression means, that gives the logFC?
1) Can you tell me if my contrast matrix is correct?
2) What is the point of using "~" in the design matrix? Or said in an other way, what is the point of not wanting an intercept?
3) I have seen examples of people adding "coef=1" to their topTable function. What does this mean?
4) what does the "intercept" column in the design matrix mean?
Kind regards, Stine
Here is all my coding:
> Array_benigntissue <- read.table("30sB_30B_samplesPAIRED.csv", header = T, sep = ";", stringsAsFactors = T, dec = ",", skip = 1)
> #Convert dataframe into matrix:
> Array_benigntissue2 <- as.matrix(sapply(Array_benigntissue,as.numeric))
> Array_benigntissue2 <- Array_benigntissue[,-1]
> rownames(Array_benigntissue2) <- Array_benigntissue[,1]
> #Construct design matrix
> pairing <- factor(rep(1:30, each = 2))
> tissue <- factor(rep(c("B", "sB"), 30))
> design <- model.matrix(~pairing+tissue)
> design
(Intercept) pairing2 pairing3 pairing4 pairing5 pairing6 pairing7 pairing8 pairing9
1 1 0 0 0 0 0 0 0 0
2 1 0 0 0 0 0 0 0 0
3 1 1 0 0 0 0 0 0 0
4 1 1 0 0 0 0 0 0 0
5 1 0 1 0 0 0 0 0 0
6 1 0 1 0 0 0 0 0 0
7 1 0 0 1 0 0 0 0 0
8 1 0 0 1 0 0 0 0 0
9 1 0 0 0 1 0 0 0 0
.....
pairing28 pairing29 pairing30 tissuesB
1 0 0 0 0
2 0 0 0 1
3 0 0 0 0
4 0 0 0 1
5 0 0 0 0
6 0 0 0 1
7 0 0 0 0
8 0 0 0 1
9 0 0 0 0
10 0 0 0 1
11 0 0 0 0
12 0 0 0 1
13 0 0 0 0
14 0 0 0 1
15 0 0 0 0
16 0 0 0 1
17 0 0 0 0
18 0 0 0 1
19 0 0 0 0
20 0 0 0 1
21 0 0 0 0
22 0 0 0 1
23 0 0 0 0
24 0 0 0 1
25 0 0 0 0
26 0 0 0 1
27 0 0 0 0
28 0 0 0 1
29 0 0 0 0
30 0 0 0 1
31 0 0 0 0
32 0 0 0 1
attr(,"assign")
[1] 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
attr(,"contrasts")
attr(,"contrasts")$pairing
[1] "contr.treatment"
attr(,"contrasts")$tissue
[1] "contr.treatment"
> fit = lmFit(Array_benigntissue2[,4:63], design)
It seems your question relates to the limma package, so you should include the limma tag on your post.