Search
Question: Using LIMMA for differential gene analysis of microarray data
0
gravatar for Francesca McGrath
13 days ago by
Australia
Francesca McGrath 0 wrote:

Hi Forum, 

I'm trying to determine genes differentially expressed between a disease group (n=16), and healthy controls (n=10), using the limma package. I'm wanting to generate mainly adjusted p values and logFC for all genes, so that I can then sort them by significance on another platform (excel). My data is from a microarray experiment (Affymetrix Human Gene Array) stored in the GEO repository. The issue I'm having is that results generated differ from previous analysis done on the same samples, so I believe that as a (very) new user I have missed some (or more likely many) steps. 

I manually downloaded a GEO supplementary file (GSE...Normalised_data_with_annotation) and then imported the dataset into R. I do not believe this dataset has been log transformed. From this I created a text file with groups of interest ("disease_healthy") and imported it into R: The first 16 columns are disease patients and the remaining 10 are controls. There are 25582 rows of genes. To interrogate the data I used the following steps after loading the limma package:

> disease_healthy <- read.delim("~/disease_healthy.txt", header=FALSE)
>   View(disease_healthy)
> rma.file <- read.table("disease_healthy.txt", header=FALSE) 
> eset <- new("ExpressionSet", exprs=as.matrix(rma.file))
> design <-model.matrix(~0+factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0))) 
> colnames(design) <- c("disease", "healthy") 
> fit <- lmFit(eset, design)
> contrast.matrix<-makeContrasts(d.vs.h=disease-healthy, levels=design)
> fit2<-contrasts.fit(fit, contrast.matrix)
> fit2.eBayes<-eBayes(fit2)
> topTable(fit2.eBayes,coef="d.vs.h",n=Inf,sort.by="none",adjust.method="fdr")

I'm not sure whether inconsistencies with previous analysis are because the data needed further alterations before the steps I took here, or if I'm completely off base with my analysis. I also used the same steps with log2 transformed data but this didn't help.

Any help/direction that could be provided would be much appreciated!

ADD COMMENTlink modified 12 days ago by Aaron Lun21k • written 13 days ago by Francesca McGrath 0

I wonder what "previous analyses" you are referring to, since GEO doesn't describe or store analyses?

ADD REPLYlink written 12 days ago by Gordon Smyth35k
2
gravatar for Aaron Lun
12 days ago by
Aaron Lun21k
Cambridge, United Kingdom
Aaron Lun21k wrote:

Reproducing existing analyses from a brief textual description is notoriously difficult, and part of the reason why journals mandate code availability as part of the submission process. Updates to packages or R itself can also lead to differences. So unless you've got the code that they used for the previous analyses, the best you can do is to guess.

With that in mind, the limma part of your analysis looks mostly fine, assuming that (i) the data are already background-corrected, normalized and log-transformed, and (ii) the design matrix is correct with no additional blocking factors required. You may also need to do some filtering or array weighting, read the limma user's guide for details.

ADD COMMENTlink written 12 days ago by Aaron Lun21k

Thank you for your answer! 

All the modifications you mentioned have already been done, so I think I'll run with the analysis I have. 

ADD REPLYlink written 12 days ago by Francesca McGrath 0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 426 users visited in the last hour