Question

limma questions (normalization)

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 6 hours ago

WEHI, Melbourne, Australia

Dear Brooke, >Date: Tue, 18 Sep 2007 17:28:23 -0400 (EDT) >From: Brooke LaFlamme <bal44 at="" cornell.edu=""> >Subject: [BioC] limma questions (normalization) >To: bioconductor at stat.math.ethz.ch > >Hi, I am very new to bioconductor and microarray analysis in >general. I am using Windows XP and R version 2.5.1. > >I have a few questions about the limma package that I am using for >cDNA array analysis. > >First: I would like to flag out all spots with fluorescence >intensity lower than that of the mean background intensity of the array. Personally, I think this sort of spot filtering is almost always a bad idea. It can make sense to remove probes from the downstream differential expression analysis which have very low intensity on every array, but removing them before the normalization step is usually counter productive. Filtering out individual spots just seems bad from every point of view. You are implying that a low intensity spot is "low quality", but this is not necessarily so. Usually it is telling you that the probe is not expressed, which is information you should preserve rather than throw out. If you're new to microarray analysis, it might be a good idea to follow one of the case study examples closely, which use recommended methods, instead of trying to put together your own ad hoc analysis. > The weight function I am using to do this is (using Genepix output > file (*.gpr) column names): > >myfun<-function(x){ > okred<-mean(x[,"B635"])-x[,"F635 Mean"]<=0 > okgreen<-mean(x[,"B532"])-x[,"F532 Mean"]<=0 > as.numeric(okgreen & okred) > } > >Does this actually do what I want? I?m not certain of how to >calculate ?mean background intensity? for the whole array. I assume you're trying to use the wt.fun argument of read.maimages(). You cannot use this approach to compute the mean background intensity of the array. You need to read in the whole array before you can do that. >Also, when spots are given a weight of zero, are only spots that >have weight >0 across all replicate arrays included in the analysis? No. If you give an individual spot a weight of zero, then that individual spot is removed from the analysis. The corresponding probe will remain in the analysis if it has any spots with weight>0. > How do I find the number of spots included in the analysis? I know > how to do this for individual arrays but if arrays 1-4 (out of 16) > are in group 1, how do I find how many spots are included in the > analysis for group 1? colSums(RG$weights==0) >Second: Two of my array groups consistently have all significant >spots being ?downregulated? relative to the control. I feel that >there must be an error in my normalization step or background >correction step. This is the code I am using: > >#Background correction and normalization > RG2<-backgroundCorrect(RG, method="normexp", offset=50) > w<-modifyWeights(RG2$weights, RG2$genes$Status, > c("Actin","Buffer"), c(2,2)) > MA<-normalizeWithinArrays(RG2, method="loess", weights=w) > MA.Aq<-normalizeBetweenArrays(MA) >#look at array weights > arrayw<-arrayWeightsMA.Aq) >#Create the design matrix for linear models > design<-modelMatrix(targets, ref="control") >#fit a linear model to each gene > fit<-lmFitMA.Aq, design=design, weights=arrayw) >#create a contrast matrix to compare each treatment to the control case > contrast.matrix<-makeContrasts(Acp26Aa, Acp29Ab, Acp36DE, > Acp62F, levels=design) >#expand linear model and compute empirical Bayes statistics > fit2<-contrasts.fit(fit, contrast.matrix) > fit2<-eBayes(fit2) The best way to judge if your background correction and normalization has been successful is to look at your data, before and after normalization, using diagnostic plots such as plotMA() which are described in the limma User's Guide. >Is normexp the best way to do background correction? Usually it is, see http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btm412 . But I would not use any method on microarray data without scrutinising the results, as described above. >Also, this array (DGRC-1) is printed with 48 pins, so I?m pretty >sure I can?t use printtiploess. Is this true? Why not? We have regularly use printtiploess with 48 pins, although there are other possibilities, like robustspline. >Finally, what is the underlying linear model that lmFit is using? I >can find no information on this. You don't mention having looked at any of the documentation. >I assume it treats treatment and dye as fixed effects and array as a >random effect, but I just want to confirm this. I think that have you read literature about a different method. It would help for you to say what you have read. The limma function lmscFit() would treat arrays as random, but you are using lmFit(), which treats everything as fixed, unless you use the block argument. In your case, you are analysing log-ratios, so any array effect has already been removed from the data. So there is no array effect in the model. There also is no dye effect in your model, because you didn't add one. See section 8.1.2 of the limma User's Guide. A dye effect would only make sense anyway if you have dye swaps in your experimental design. Best wishes Gordon >Thank you so much for any help demystifying this. Sorry this was so long. > >Brooke L.

Microarray Normalization probe limma Microarray Normalization probe limma • 889 views

ADD COMMENT • link 16.6 years ago Gordon Smyth 50k