Question

Two-class unpaired test using LIMMA

0

Entering edit mode

Bade ▴ 310

@bade-5877

Last seen 3.4 years ago

Delaware

Hi All,

I want to identify genes that are significantly associated with normally developing tissues. The data sets which I have can be divided into two classes:

1. Normally developing - Samples from wild type tissues

2. Impaired development - Samples from mutants

Both classes have samples from different stages, some stages are common and some not. So, I was thinking of performing a 'two-class unpaired test' similar to the SAM (Significance analysis of Microarrays). Is there a way to perform similar test using Limma package (Bioconductor)?

Sample Code:

AF.mydata <- ReadAffy()
AF.esetRMA<-rma(AF.mydata) 
AF.sampletype <- c('Norm','Norm','Norm','Mut','Mut','Mut','Mut',)
AF.group <-factor(AF.sampletype)
AF.fit <- lmFit(AF.esetRMA,AF.design)
AF.contrast.matrix<-makeContrasts(Norm-Mut,levels=AF.design)
AF.fit2<- contrasts.fit(AF.fit,AF.contrast.matrix)
AF.fit2 <- eBayes(AF.fit2)

topTable(AF.fit2, coef = 1, adjust = "fdr", number = 10,sort.by='logFC')

Would this be right thing to do?

Bade

Limma SAM differential expression • 2.5k views

ADD COMMENT • link updated 9.4 years ago by James W. MacDonald 65k • written 9.4 years ago by Bade ▴ 310

1

Entering edit mode

You don't show how you create your design matrix, so it's hard to say if you need a contrasts matrix or not. I'm assuming you did something like

AF.design <- model.matrix(~ 0 + AF.group)

colnames(AF.design) <- gsub("AF.group", "", colnames(AF.design))

In which case you are doing an unpaired t-test for each gene, and you do need a contrasts matrix.

However, you say that both classes 'have samples from different stages', in which case this analysis is ignoring that fact. In other words, this analysis is ignoring any differences that exist between replicate samples of each type. So if the four Mut samples are somehow different, then you are ignoring that fact (and the same is true for the Norm samples).

Also note that sorting your topTable() results by fold change doesn't necessarily put the genes with the most evidence for differential expression at the top of the list. A large fold change doesn't in and of itself imply the most significant difference.

ADD REPLY • link 9.4 years ago James W. MacDonald 65k

0

Entering edit mode

Hi James,

Sorry I forgot to put in the replicates info. Here is the modified code:

AF.mydata <- ReadAffy()
AF.esetRMA<-rma(AF.mydata)
AF.sampletype <- c('Norm1','Norm1','Norm2','Norm2','Norm3','Norm3'
'Mut1','Mut1','Mut2','Mut2','Mut3',Mut3','Mut4','Mut4')
AF.group <-factor(AF.sampletype)
AF.design<- model.matrix(~0+AF.group)
AF.fit <- lmFit(AF.esetRMA,AF.design)

Now I have replicates in my design but then how can I make a single comparison between 'Norm' and 'Mut'? I need to compare between both groups, something like this:

AF.contrast.matrix<-makeContrasts(Norm-Mut,levels=AF.design) ## This might not be correct thing to do
AF.fit2<- contrasts.fit(AF.fit,AF.contrast.matrix)
AF.fit2 <- eBayes(AF.fit2)
topTable(AF.fit2, coef = 1, adjust = "fdr", number = 10,sort.by='t') ## This is just to complete dummy code

Also, I am not sure if performing an unpaired t-test would be the best thing in my case. Can you suggest some design or test using Limma which is analogous to SAM two-class unpaired test? Which picks up genes whose mean expression is different across 2 groups of samples (analogous to one-way ANOVA) .

Thanks

Bade

ADD REPLY • link updated 9.4 years ago by Gordon Smyth 50k • written 9.4 years ago by Bade ▴ 310

score 3 · Accepted Answer · 2014-11-25

You cannot use makeContrasts() to make a contrast for levels that don't exist in your design. And writing dummy code that cannot possibly work isn't a useful thing to do, because, well, it won't work, so what's the point? You are much better off trying to say what you want using words rather than fake, wrong code.

Also, don't respond to answers with another answer (you are not answering anything). Use the ADD COMMENT link at the bottom of the answers if you want to clarify something about a previous answer.

So, back to the story. An unpaired t-test is exactly analogous to the SAM two-class unpaired test, so I am unsure why you think they are different (just as a one-way ANOVA with only two levels is identical to an unpaired t-test).

The design you are fitting is a 'cell means' model, which computes the mean for each group (Norm1, Norm2, etc), and which will allow you to make any comparison between any of the groups. So you can compare Norm1 to Norm2 or Norm1 to Mut3 or whatever. The underlying assumption would be that each of these groups is somehow different from the other groups (e.g., the Norm1 and Norm2 samples are both normals, but somehow different as well).

If you want to do a contrast of Norm - Mut, then you have to first fix the colnames of your design matrix, which will be

> colnames(AF.design)
[1] "AF.groupMut1"  "AF.groupMut2"  "AF.groupMut3"  "AF.groupMut4"
[5] "AF.groupNorm1" "AF.groupNorm2" "AF.groupNorm3"

By doing

> colnames(AF.design) <- gsub("AF.group", "", colnames(AF.design))
> colnames(AF.design)
[1] "Mut1"  "Mut2"  "Mut3"  "Mut4"  "Norm1" "Norm2" "Norm3"

As I mentioned in my first answer. You can now create a contrast matrix using those names

> makeContrasts((Mut1+Mut2+Mut3+Mut4)/4 - (Norm1+Norm2+Norm3)/3, levels = AF.design)
       Contrasts
Levels  (Mut1 + Mut2 + Mut3 + Mut4)/4 - (Norm1 + Norm2 + Norm3)/3
  Mut1                                                  0.2500000
  Mut2                                                  0.2500000
  Mut3                                                  0.2500000
  Mut4                                                  0.2500000
  Norm1                                                -0.3333333
  Norm2                                                -0.3333333
  Norm3                                                -0.3333333

This will compute the average difference between the Mut and Norm samples. Whether or not it makes sense to do so is something you have to decide.