Question

Removing control probes and treating technical replicates in Agilent one colour data

0

Entering edit mode

eldridb@stud.ntnu.no ▴ 10

@eldridbstudntnuno-2751

Last seen 9.6 years ago

Dear all, I am working with single channel Agilent arrays, and need some help in processing and the data correctly, using Limma. (I have followed the advice I found in the archives for using Limma for one colour Agilent data, by using dummy variables for the red channel.) I started out weighting all the control probes and spots flagged as outliers by Feature Extraction to zero when loading the data. The differentially expressed genes I found downstream seemed logical. But then I understood that the control probes should not be weighted zero, but should be included in the normalisation procedure, and instead removed before statistical analysis to increase power. However, when I tried to subset the expression object in lmFit(), the downstream result was a very different and shorter genelist. Also, the second genelist includes some control probes (the first one doesn?t), which should not be possible, since all the control probes were supposed to be removed. I have 38 arrays with 36 biological and 2 technical replicates. I?ve loaded the .txt files with RG <- read.maimages(). I normalized between arrays with vsn. I won?t go into the design and contrast matrices, because I don?t think it?s relevant for my question. I?ve tried to include the information from the technical replicates by using duplicateCorrelation() and defined which samples are biological replicates as a column (Biolrep= 1,2,3,4,4,5,6,...,36) in the targets file. Please tell me if the information I?m giving is insufficient. Can anyone see any mistakes in my code, or otherwise give an explanation to why I am not able to remove the control probes correctly or other reasons for the varying results? I am new to Bioconductor and fairly new to microarray analysis, so I might be doing a very basic mistake. I would also appreciate comments on my treatment of the technical replicates, because this also seems to change downstream analysis to some extent (in comparison to just leaving out the two technical replicates of the analysis). RG <-read.maimages(...) Gvsn <- normalizeBetweenArrays(RG$G,method="vsn") #Replicate structure biolrep <- targets$Biolrep corfit<- duplicateCorrelation(Gvsn, design=design, ndups=1, block=biolrep) #Pick the probes that are not controls isGene <- RG$genes$ControlType == 0 fit <- lmFit(Gvsn[isGene,], design, weights=RG$weights[isGene,], ndups=1, block=biolrep, cor=corfit$consensus) fit2 <- contrasts.fit(fit,cont.matrix) fit2 <- eBayes(fit2) And this is the flag function I use before reading the data, which is stores in RG$weights is: FlagFunction <- function(flag) { ##Weight only present spots 1, flagged spots 0. present <- x$gIsPosAndSignif == 1 y <- as.numeric(present & manual) ##Remove flagged spots sat <- x$gIsSaturated == 0 featureOL1 <- x$gIsFeatNonUnifOL == 0 featureOL2 <- x$gIsFeatPopnOL == 0 flagged <- (sat & featureOL1 & featureOL2) flagged <- grep(FALSE, flagged) good <- grep(TRUE, y==1) flagged <- intersect(flagged, good) y[flagged] <- 0 y } I would be very grateful for any comments, preferably as soon as possible since I have quite a time pressure to finish this analysis. Regards, Eldrid Borgan

Microarray GO vsn limma Microarray GO vsn limma • 1.6k views

ADD COMMENT • link 16.0 years ago eldridb@stud.ntnu.no ▴ 10