Question: Using LIMMA for differential gene analysis of microarray data
0
gravatar for Francesca McGrath
9 weeks ago by
Australia
Francesca McGrath 0 wrote:

Hi Forum, 

I'm trying to determine genes differentially expressed between a disease group (n=16), and healthy controls (n=10), using the limma package. I'm wanting to generate mainly adjusted p values and logFC for all genes, so that I can then sort them by significance on another platform (excel). My data is from a microarray experiment (Affymetrix Human Gene Array) stored in the GEO repository. The issue I'm having is that results generated differ from previous analysis done on the same samples, so I believe that as a (very) new user I have missed some (or more likely many) steps. 

I manually downloaded a GEO supplementary file (GSE...Normalised_data_with_annotation) and then imported the dataset into R. I do not believe this dataset has been log transformed. From this I created a text file with groups of interest ("disease_healthy") and imported it into R: The first 16 columns are disease patients and the remaining 10 are controls. There are 25582 rows of genes. To interrogate the data I used the following steps after loading the limma package:

> disease_healthy <- read.delim("~/disease_healthy.txt", header=FALSE)
>   View(disease_healthy)
> rma.file <- read.table("disease_healthy.txt", header=FALSE) 
> eset <- new("ExpressionSet", exprs=as.matrix(rma.file))
> design <-model.matrix(~0+factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0))) 
> colnames(design) <- c("disease", "healthy") 
> fit <- lmFit(eset, design)
> contrast.matrix<-makeContrasts(d.vs.h=disease-healthy, levels=design)
> fit2<-contrasts.fit(fit, contrast.matrix)
> fit2.eBayes<-eBayes(fit2)
> topTable(fit2.eBayes,coef="d.vs.h",n=Inf,sort.by="none",adjust.method="fdr")

I'm not sure whether inconsistencies with previous analysis are because the data needed further alterations before the steps I took here, or if I'm completely off base with my analysis. I also used the same steps with log2 transformed data but this didn't help.

Any help/direction that could be provided would be much appreciated!

ADD COMMENTlink modified 9 weeks ago by Aaron Lun22k • written 9 weeks ago by Francesca McGrath 0

I wonder what "previous analyses" you are referring to, since GEO doesn't describe or store analyses?

ADD REPLYlink written 9 weeks ago by Gordon Smyth36k
Answer: A: Using LIMMA for differential gene analysis of microarray data
2
gravatar for Aaron Lun
9 weeks ago by
Aaron Lun22k
Cambridge, United Kingdom
Aaron Lun22k wrote:

Reproducing existing analyses from a brief textual description is notoriously difficult, and part of the reason why journals mandate code availability as part of the submission process. Updates to packages or R itself can also lead to differences. So unless you've got the code that they used for the previous analyses, the best you can do is to guess.

With that in mind, the limma part of your analysis looks mostly fine, assuming that (i) the data are already background-corrected, normalized and log-transformed, and (ii) the design matrix is correct with no additional blocking factors required. You may also need to do some filtering or array weighting, read the limma user's guide for details.

ADD COMMENTlink written 9 weeks ago by Aaron Lun22k

Thank you for your answer! 

All the modifications you mentioned have already been done, so I think I'll run with the analysis I have. 

ADD REPLYlink written 9 weeks ago by Francesca McGrath 0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 418 users visited in the last hour